carpenterlab / open-science-rules

Collaboratively written manuscript discussing Ten Simple Rules for Enabling Open Science in Biomedical Research

Use version control for software and data #3

Open gwaybio opened 5 years ago

gwaybio commented 5 years ago

One of the more obvious technical rules (even the low-hanging fruit has to be grabbed! 🍌)

Services like:

provide a nice framework that can easily (after some technical training) enable effective version control for software.

Version control of data is equally important. The resources I am aware of that can do this (in addition to those above) are:

Benefit

Makes results reproducible, improves sharing and modification, and lets you track how updated data impacts results... opening discussion below
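To make the "track how updated data impacts results" point concrete, here is a minimal sketch of one way to notice when data files change: record a checksum per file in a manifest that is committed next to the analysis code. This is only an illustration, not a replacement for a dedicated data versioning service; the function names (`sha256_of_file`, `build_manifest`, `changed_files`) and the manifest layout are assumptions made up for this example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file under data_dir (by relative path) to its checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files that were added, removed, or modified between manifests."""
    names = set(old_manifest) | set(new_manifest)
    return sorted(n for n in names if old_manifest.get(n) != new_manifest.get(n))


def save_manifest(manifest, manifest_path):
    """Write the manifest as sorted JSON so that diffs stay readable."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

Committing the JSON manifest with each analysis means the version control history shows which data files differed between two sets of results, even when the data itself lives elsewhere.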

allaway commented 5 years ago

Re data versioning: I'm far from unbiased but I think that Synapse is a great system for versioning and provenancing data too. :)

I have a list of other similar services somewhere, I'll dig it up if I can find it.

allaway commented 5 years ago

This is probably more of a software versioning idea: using CWL/WDL or other analysis workflows in tandem with dockerized or otherwise containerized software is a time-intensive but robust way to enable reproducible science, and it also helps others apply your methodology to other data.

allaway commented 5 years ago

Here's the list. It was compiled by the Harvard Dataverse folks. https://docs.google.com/spreadsheets/d/1KptHzDHIdB3s1v5m1mMwphcwXhOVWdkRYdjEWW1dqrE/edit#gid=2016420688

WRT that: OSF might be better considered a data versioning tool.

Edit - linking the blog post for citation purposes.

allaway commented 5 years ago

Here is a really neat project for software (or perhaps more accurately - analysis) versioning and portability that I just learned about this afternoon: http://boutiques.github.io/