Open riddlet opened 5 years ago
Interesting. Last year at Neurohackademy a bunch of us worked on a related project - you might find the repo useful at some point.
Hello! Day one we're on the top floor of the Mercado Centrale, sitting at one of the center tables. =)
Survey of GitHub repos in PubMed
Project Description
GitHub is increasingly becoming a tool of choice for computationally oriented research. There have been plenty of efforts to get scientists to make use of this valuable tool, but to my knowledge there hasn't been any attempt to evaluate how scientists actually use it.
I pulled full texts from PubMed, searched them for the string 'github', and found a bit over 20k papers that contain it. Using these texts plus the GitHub API, I think we could provide some insight into the following:
1) How many scientific repos contain a README?
2) How often do repos contain files that are likely to be data (e.g. csv, json)?
3) What are the most popular types of analytic scripts (e.g. .py, .R, .ipynb)?
4) How do the above vary by research area?
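As a rough starting point for questions 1 to 3, here is a minimal Python sketch of the two offline steps: pulling repo names out of a paper's full text, and summarizing a repo's file listing. The function names (`extract_repos`, `summarize_files`), the regular expression, and the extension sets are illustrative assumptions, not code from the gitpubs repo. The file listing itself would have to come from somewhere else, for example the GitHub REST API's Git trees endpoint; that step is left out here.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Approximate pattern for github.com/<owner>/<repo>; the character classes
# are an assumption and may miss or over-match unusual names.
REPO_PATTERN = re.compile(
    r"github\.com/([A-Za-z0-9-]+)/([A-Za-z0-9._-]+)", re.IGNORECASE
)

# Illustrative extension sets taken from the examples in the questions above.
DATA_EXTENSIONS = {".csv", ".json"}
SCRIPT_EXTENSIONS = {".py", ".r", ".ipynb"}


def extract_repos(text):
    """Return the sorted, de-duplicated 'owner/repo' slugs mentioned in text."""
    slugs = set()
    for owner, repo in REPO_PATTERN.findall(text):
        repo = repo.rstrip(".")      # drop a sentence-ending period
        if repo.endswith(".git"):    # drop a clone-URL suffix
            repo = repo[:-4]
        if repo:
            slugs.add(f"{owner}/{repo}")
    return sorted(slugs)


def summarize_files(paths):
    """Summarize a repo's file paths: README presence, data files, scripts."""
    has_readme = False
    data_files = Counter()
    scripts = Counter()
    for path in paths:
        p = PurePosixPath(path)
        ext = p.suffix.lower()
        if p.name.lower().startswith("readme"):
            has_readme = True
        if ext in DATA_EXTENSIONS:
            data_files[ext] += 1
        elif ext in SCRIPT_EXTENSIONS:
            scripts[ext] += 1
    return {
        "has_readme": has_readme,
        "data_files": dict(data_files),
        "scripts": dict(scripts),
    }


if __name__ == "__main__":
    sample_text = "Code is available at https://github.com/riddlet/gitpubs."
    print(extract_repos(sample_text))
    print(summarize_files(["README.md", "data/trials.csv", "analysis/fit.R"]))
```

Counting per-repo summaries like these across all ~20k papers, grouped by whatever research-area label is available for each paper, would be one way to approach question 4.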
Skills required to participate
Ideally, experience with Python or R and with web-based APIs. Text analysis experience would be helpful.
Integration
TBA
Preparation material
BioC API
GitHub API
Link to your GitHub repo
https://github.com/riddlet/gitpubs
Communication
We have a Mattermost channel at https://mattermost.brainhack.org/brainhack/channels/gitpubs