ropensci / unconf14

Repo to brainstorm ideas (unconference style) for the rOpenSci hackathon.

Project proposal: Indexing citations of R code #29

Open jure opened 10 years ago

jure commented 10 years ago

We've just finished the #CW14 hackathon, building a thing that tries to index as much open scientific software as possible, while also building a list of citations of said software. It builds a database of repository URL -> DOI links, e.g. http://sciencetoolbox.org/tools/65 and builds user profiles (organizations not really supported yet), e.g. http://sciencetoolbox.org/github/najoshi

Right now it's a bit funky: some citations are incorrectly attributed, for example. I'd like to improve that and add more sources that can be consumed daily. The plan we executed for the hack day is here: https://github.com/jure/sciencetoolbox/issues/11 and in a lot of ways this would be a continuation of that work.

It was perhaps too big a task for a one-day hackathon, but we did manage to index 1500+ tools and 1600+ citations from Google Scholar and EuropePMC.

Seeing as this is an R hackathon, I could focus on R code, of which there is already quite a bit in the index: http://sciencetoolbox.org/tag/r and improve on that (improving a single language would also improve everything else).

Does this sound interesting to you?

szeitlin commented 10 years ago

that is interesting to me!

jure commented 10 years ago

I'm not an R developer, but in the context of the above goal and for the purposes of this hackathon I would like to develop a Ruby gem that takes either a CRAN project URL, e.g. http://cran.r-project.org/web/packages/freestats/index.html, or the output of citation('freestats'), through a simple interface:

require 'codecitations'

# Returns an array of DOIs:
CodeCitations.find('http://cran.r-project.org/web/packages/freestats/index.html')
# => ['10.1079/PNS2005481', '10.7554/eLife.00003']
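
To make the interface proposed above a bit more concrete, here is a minimal sketch of how such a gem might be shaped. Everything in it is an assumption for illustration: the gem does not exist yet, the helper package_name and the index: keyword are hypothetical, and the in-memory hash merely stands in for whatever citation index (e.g. the sciencetoolbox database) would actually be queried. It relies on the guess that both a CRAN project URL and typical citation() output contain a CRAN link from which the package name can be recovered.

```ruby
# Hypothetical sketch of the proposed interface; not an existing gem.
module CodeCitations
  # Assumed patterns: the classic ".../web/packages/<name>/" URL and the
  # ".../package=<name>" form that citation() output usually includes.
  CRAN_URL = %r{cran\.r-project\.org/(?:web/packages/|package=)([A-Za-z][A-Za-z0-9.]*)}i

  # Recovers the package name from a CRAN URL or from text containing one.
  # Returns nil when no CRAN link is present.
  def self.package_name(input)
    name = input[CRAN_URL, 1]
    name&.sub(/\.\z/, '') # R package names cannot end with a period
  end

  # `index` is a stand-in for the real citation database:
  # a hash of package name => array of DOIs citing that package.
  def self.find(input, index: {})
    name = package_name(input)
    name ? index.fetch(name, []) : []
  end
end

# Placeholder data, reusing the example DOIs from the comment above.
index = { 'freestats' => ['10.1079/PNS2005481', '10.7554/eLife.00003'] }

CodeCitations.find('http://cran.r-project.org/web/packages/freestats/index.html', index: index)
# => ["10.1079/PNS2005481", "10.7554/eLife.00003"]
```

Normalizing both kinds of input to a package name first keeps the lookup itself trivial, so the index could later be swapped for a real data source without changing the public find call.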

What do you think?

sckott commented 10 years ago

@jure Interesting - sounds great.

sckott commented 10 years ago

Where are you hacking on this?