datalad / datalad-ukbiobank

Resources for working with UKBiobank as a DataLad dataset
MIT License

Implement name prefix support for managed branches? #74

Open mih opened 3 years ago

mih commented 3 years ago

Two sites (on one application) might have different subsets of data (storage limits, interest differences), but still might want to collaborate on the same participant datasets. ATM the implementation only supports full updates, i.e. only what is available at the time of an update will make it into the managed branches.

Supporting proper incremental updates is not trivial, as it would involve determining which files came from which download and selectively maintaining those that existed before.

A cheaper approach might be to add support for a branch-name prefix. Each site would have its own set of incoming-* branches, and the union of the site contributions could be achieved by merging the incoming-* branches from each site into a mainline branch whenever an update is made.
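
To make the idea concrete, here is a rough sketch of how prefixed branch names and the resulting merges could look. Nothing here is existing datalad-ukbiobank API: the helper names `site_branch` and `merge_commands`, the `/` separator, the site labels `siteA`/`siteB`, the example branch `incoming-native`, and the mainline name `main` are all assumptions made for illustration. It only builds plain `git checkout` and `git merge` command lines.

```python
import subprocess
from typing import List


def site_branch(prefix: str, branch: str) -> str:
    """Return the site-specific name of a managed branch.

    An empty prefix keeps the unprefixed name. The '/' separator is an
    assumption of this sketch, not an agreed convention.
    """
    return f"{prefix}/{branch}" if prefix else branch


def merge_commands(site_prefixes: List[str], branch: str,
                   mainline: str = "main") -> List[List[str]]:
    """Build the git commands that merge each site's branch into mainline."""
    cmds = [["git", "checkout", mainline]]
    for prefix in site_prefixes:
        # One merge per site; the union of contributions ends up in mainline.
        cmds.append(["git", "merge", "--no-edit", site_branch(prefix, branch)])
    return cmds


def run(cmds: List[List[str]], cwd: str = ".") -> None:
    """Execute the commands in a repository; stops at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run(merge_commands(["siteA", "siteB"], "incoming-native"), cwd=path_to_dataset)` after each site's update would then bring both contributions into the mainline branch. Conflicts are not handled here. They would be most likely for files that both sites have downloaded.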

Ping @Hoda1394