neuropoly / data-management

Repo that deals with datalad aspects for internal use

Investigate DVC #78

Open kousu opened 3 years ago

kousu commented 3 years ago

As an alternative to #68: dvc (suggested here) is meant to address reproducible science workflows.

We need to get hands-on with it to understand what it does and doesn't cover.

kousu commented 3 years ago

So far, my impression is it's largely an alternative to datalad.

Both are built on git, and both have a run command. Both have the basics of a dataflow programming system: dvc repro and datalad rerun. dvc has its own definition of remote storage, covering the same basic territory as datalad/git-annex special remotes; notably, these special remotes are only loosely coupled to git version/integrity tracking.
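
To make the "dataflow" point concrete, here is a toy sketch of the idea behind a repro-style command: record a hash of a stage's dependency, and re-run the stage only when that hash changes or the output is missing. This is not how dvc or datalad are implemented; the `Stage` class, its `repro` method, and the file names are hypothetical, chosen only for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Content hash of a file; stands in for the checksums a tool would record."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class Stage:
    """Hypothetical pipeline stage: one dependency file, one output file, one function."""

    def __init__(self, dep: Path, out: Path, func):
        self.dep, self.out, self.func = dep, out, func
        self.recorded = None  # dependency hash seen at the last run

    def repro(self) -> bool:
        """Re-run only if the dependency changed or the output is missing."""
        current = file_hash(self.dep)
        if current == self.recorded and self.out.exists():
            return False  # up to date, nothing to do
        self.out.write_text(self.func(self.dep.read_text()))
        self.recorded = current
        return True


def demo() -> list:
    """Call repro three times and report which calls actually did work."""
    with tempfile.TemporaryDirectory() as tmp:
        raw, clean = Path(tmp) / "raw.txt", Path(tmp) / "clean.txt"
        raw.write_text("Hello")
        stage = Stage(raw, clean, str.upper)
        ran = [stage.repro(), stage.repro()]  # second call finds nothing changed
        raw.write_text("Hello again")         # edit the input file
        ran.append(stage.repro())             # so the stage runs again
        return ran
```

Calling `demo()` shows the stage running on the first call, being skipped on the second, and running again after the input is edited. A real tool would persist the recorded hashes alongside the git history rather than in memory, which is where the coupling to git version/integrity tracking comes in.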