bids-standard / bids-matlab

MATLAB / Octave tools for BIDS datasets
https://bids-matlab.readthedocs.io
MIT License

Initialize derivative data set with a copy of the raw data #78

Closed: Remi-Gau closed this issue 3 years ago

Remi-Gau commented 4 years ago

There are a few circumstances (see below) where I would like to initialize the derivatives with a copy, complete or partial, of the original full BIDS dataset. I have been working on a prepare_derivatives.m for that specific purpose, with some features such as:

Why would I want to do that? Because it is convenient to work on a subset of the data and/or in a "sandpit", for example

Originally posted by @ChristophePhillips in https://github.com/bids-standard/bids-matlab/issues/60#issuecomment-714505816



Remi-Gau commented 3 years ago

I also have something similar in our lab pipeline, though yours seems to have more features. I also know that spmup by @CPernet does some of that.

So that seems like one of the obvious low-hanging fruits!

Remi-Gau commented 3 years ago

Copied from original issue

@ChristophePhillips: spm_copy and spm_mkdir are tools that can help with what you want to do (see https://en.wikibooks.org/wiki/SPM/BIDS#Formatting_datasets_into_BIDS). We used them here: https://github.com/spm/MultimodalScripts/blob/master/code/scripted/master_script.m#L49-L76

Remi-Gau commented 3 years ago

I had actually opened an issue on one of our repos to get spm_mkdir and spm_copy out of SPM, so I would love to have them as part of bids-matlab.

One headache to keep in mind: datasets curated with datalad have their content stored with git-annex (I need to finish a PR about that on the datalad handbook). So a simple call to copyfile will not follow the symbolic links, and you end up with just a bunch of broken links.

Two options:

the user must make sure they have run datalad unlock on the files to copy
try a system call to cp -L and fall back on copyfile if it fails (this is the hacky way of doing things that we are currently using: see here)

The second option is going to make Windows users cry, though... if they use datalad, that is: not a huge user base at the moment, but that too could grow.
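For illustration, the second option could look roughly like the sketch below. It is only a minimal sketch: copy_with_fallback is a hypothetical name (it exists neither in bids-matlab nor in SPM), and the quoting assumes that paths contain no double quotes or other shell-special characters.

```matlab
function copy_with_fallback(src, dest)
  % COPY_WITH_FALLBACK  Hypothetical helper, not part of bids-matlab or SPM.
  % Try "cp -L" first so that symbolic links (e.g. git-annex pointers)
  % are dereferenced; if the system call fails for any reason
  % (no cp on this platform, unreadable source...), fall back on copyfile.

  % Capture the command output so that it is not printed to the console.
  [status, ~] = system(sprintf('cp -L "%s" "%s"', src, dest));

  if status ~= 0
    [ok, msg] = copyfile(src, dest);
    if ~ok
      error('copy_with_fallback:copyFailed', ...
            'Could not copy %s: %s', src, msg);
    end
  end
end
```

Note that when the fallback is used on an annexed file, the broken-link problem described above is still there, so this only helps where cp -L is actually available.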

Any other ways around this that would make everyone happy?

Remi-Gau commented 3 years ago

Originally posted by @gllmflndn

I think we should make a system() call only out of necessity. We could test for symlinks within an isunix condition and only use cp -L for these?
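A minimal sketch of that idea, under a few assumptions: copy_following_links and is_symlink are hypothetical names (not existing bids-matlab or SPM functions), the symlink check relies on the POSIX "test -L" command, and paths are assumed to contain no double quotes or other shell-special characters.

```matlab
function copy_following_links(src, dest)
  % COPY_FOLLOWING_LINKS  Hypothetical helper, not part of bids-matlab or SPM.
  % Only resort to a system call when it is needed: on Unix-like systems,
  % symbolic links are dereferenced with "cp -L"; everything else
  % (regular files, any file on Windows) goes through copyfile.

  if isunix() && is_symlink(src)
    [status, msg] = system(sprintf('cp -L "%s" "%s"', src, dest));
    if status ~= 0
      % Typically a link whose target is missing (content not fetched).
      error('copy_following_links:cpFailed', 'cp -L failed: %s', msg);
    end
  else
    [ok, msg] = copyfile(src, dest);
    if ~ok
      error('copy_following_links:copyFailed', ...
            'Could not copy %s: %s', src, msg);
    end
  end
end

function tf = is_symlink(pth)
  % "test -L" exits with status 0 when PTH is a symbolic link.
  [status, ~] = system(sprintf('test -L "%s"', pth));
  tf = (status == 0);
end
```

This keeps Windows users on the plain copyfile path, at the cost of one extra system call per file on Unix to detect links.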