jsheunis / multi-echo-super


Where should corrections to the datasets be done? #4

Open tsalo opened 1 year ago

tsalo commented 1 year ago

For example, the T1w images in ds000210 are 4D and need to be reduced to 3D before fMRIPrep can be run on them. Should the modified dataset be published to G-Node GIN and then added to the super-dataset from there instead of using the one directly from OpenNeuro?
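The 4D-to-3D fix described here amounts to dropping a singleton volume axis. A minimal sketch of the array-level operation, using numpy (in practice the image would be loaded and written back with nibabel, preserving the affine and header; the function name and shapes below are illustrative only):

```python
import numpy as np

def reduce_4d_to_3d(data):
    """Drop a trailing singleton volume axis from a 4D array.

    Assumes the T1w image was stored with shape (X, Y, Z, 1);
    refuses to act if the fourth axis holds more than one volume,
    since simply squeezing would then be the wrong fix.
    """
    if data.ndim != 4:
        raise ValueError(f"expected a 4D array, got {data.ndim}D")
    if data.shape[3] != 1:
        raise ValueError("fourth axis is not singleton; cannot squeeze safely")
    return data[..., 0]

# Illustrative 4D "image" with a singleton fourth dimension
vol4d = np.zeros((182, 218, 182, 1))
vol3d = reduce_4d_to_3d(vol4d)
print(vol3d.shape)  # → (182, 218, 182)
```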

jsheunis commented 1 year ago

I think these should be seen as different states of the same dataset (i.e. the same git repo), which has multiple siblings (i.e. git remotes).

The original dataset is cloned as a subdataset into the multi-echo superdataset (here, into the raw subdirectory). Then the modifications are made via code (in this case the 4D-to-3D reduction), which results in a new commit (after datalad save) in both the subdataset and the superdataset (the latter being the subdataset version bump). At this stage a GIN sibling can be added to the dataset (which was originally cloned from its GitHub sibling) and the updated state can be pushed there. There is also the possibility to set clone-candidate priorities for subdatasets (see: https://handbook.datalad.org/en/latest/beyond_basics/101-148-clonepriority.html), e.g. if we always want people to clone from GIN automatically when obtaining this dataset through the multi-echo superdataset.
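The workflow could look roughly like the following (the dataset paths, GIN URL, and sibling name are placeholders, not the project's actual configuration):

```shell
# Clone the OpenNeuro dataset as a subdataset of the superdataset
datalad clone -d . https://github.com/OpenNeuroDatasets/ds000210.git raw/ds000210

# Apply the fix inside the subdataset and save it; saving the
# superdataset afterwards records the subdataset version bump
cd raw/ds000210
datalad save -m "Reduce 4D T1w images to 3D"
cd ../..
datalad save -m "Update ds000210 subdataset" raw/ds000210

# Add a GIN sibling to the subdataset and push the updated state there
cd raw/ds000210
datalad siblings add --name gin --url git@gin.g-node.org:/some-org/ds000210.git
datalad push --to gin
```

Clone-candidate priorities for the subdataset can then be configured in the superdataset as described in the handbook chapter linked above, so that GIN is tried first when people obtain the subdataset.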

It's always good to keep provenance, so I think running the 4D-to-3D conversion via the datalad run command is a good idea. I didn't do this when I created the derivatives subdatasets from their file manifests, but it should be easy to redo.
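For provenance capture, datalad run records the command and its declared inputs/outputs in the resulting commit, so the conversion can later be inspected or re-executed with datalad rerun. A sketch (the script path and file patterns are hypothetical):

```shell
# Run the conversion with provenance: the command, inputs, and outputs
# are recorded in the commit produced by datalad run
datalad run \
  -m "Reduce 4D T1w images to 3D" \
  --input "sub-*/anat/*_T1w.nii.gz" \
  --output "sub-*/anat/*_T1w.nii.gz" \
  "python code/reduce_t1w.py"
```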