psychoinformatics-de / datalad-hirni

DataLad extension for (semi-)automated, reproducible processing of (medical/neuro)imaging data
http://datalad.org

FR: use PBS-runner for conversion if available #146

Open pvavra opened 4 years ago

pvavra commented 4 years ago

This might be a bigger feature request: let hirni-spec2bids conversions run in parallel on a cluster, if one is available.

My current understanding of the code is as follows: datalad has a global option to use condor, via --pbs-runner condor. However, that simply wraps whatever datalad command was called into a .submit file and adds it to the job queue. So, running hirni-spec2bids on a large set of spec files would still run as a single job.

However, it would be good to have a simple interface to submit each spec-processing step as a separate job.

Now, I can make that happen by writing a script which calls datalad --pbs-runner condor hirni-spec2bids .. for each studyspec.json file, but this feels like an unnecessary workaround.
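For reference, the workaround script could look roughly like the sketch below. It assumes that every acquisition directory contains a file named studyspec.json and that hirni-spec2bids accepts the spec file path as a positional argument; the helper names (find_specs, build_command, submit_all) are made up for illustration. With dry_run=True it only returns the commands it would run.

```python
from pathlib import Path
import subprocess


def find_specs(root):
    """Return all studyspec.json files below root, sorted for a stable order."""
    return sorted(Path(root).rglob("studyspec.json"))


def build_command(spec, runner="condor"):
    """Build one datalad call that wraps a single spec file via --pbs-runner.

    Assumption: hirni-spec2bids takes the spec file as a positional argument.
    """
    return ["datalad", "--pbs-runner", runner, "hirni-spec2bids", str(spec)]


def submit_all(root, runner="condor", dry_run=True):
    """Submit one job per spec file; with dry_run only return the commands."""
    commands = [build_command(spec, runner) for spec in find_specs(root)]
    if not dry_run:
        for cmd in commands:
            # Each call is expected to queue a separate job on the cluster.
            subprocess.run(cmd, check=True)
    return commands
```

Calling submit_all(".", dry_run=False) from the study dataset root would then issue one datalad call per spec file instead of a single call for all of them.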

Instead, I propose the following solution: add a datalad.hirni.pbs-runner config setting which, when set to condor, does this "wrapping" automatically for each studyspec file. A separate setting like that would not interfere with any manual datalad .. calls, so one could still call e.g. datalad run locally. (The datalad docs mention that there should be a config setting for pbs-runner, but I haven't tried it out.)
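To make the proposal concrete, here is a minimal sketch of the dispatch logic I have in mind. The datalad.hirni.pbs-runner key does not exist yet, the config is passed in as a plain dict rather than read through datalad's config machinery, and commands_for_specs is a hypothetical name, so treat this as an illustration only.

```python
def commands_for_specs(specs, config):
    """Return the datalad calls needed to convert the given spec files.

    If the (proposed, not yet existing) datalad.hirni.pbs-runner setting is
    present, each spec file gets its own wrapped call, i.e. its own job.
    Otherwise all spec files go into one plain local call.
    """
    runner = config.get("datalad.hirni.pbs-runner")
    if runner:
        return [
            ["datalad", "--pbs-runner", runner, "hirni-spec2bids", str(spec)]
            for spec in specs
        ]
    return [["datalad", "hirni-spec2bids"] + [str(spec) for spec in specs]]
```

With the setting absent, the behaviour stays exactly as it is today, which is the point of using a hirni-specific key instead of the global option.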

This would (later) allow some logic/heuristic to be included, for example to run on a cluster only if the studyspec includes conversions of DICOMs, but not if only text files need to be handled. This heuristic could be made customizable in future steps.
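A first version of such a heuristic might look like the sketch below. It assumes that a studyspec file holds one JSON object per line and that each object has a "type" field whose value starts with "dicomseries" for DICOM conversions; both the function name needs_cluster and the heavy_prefixes parameter are hypothetical.

```python
import json


def needs_cluster(spec_text, heavy_prefixes=("dicomseries",)):
    """Return True if any snippet in the spec looks like a heavy conversion.

    Assumptions: one JSON object per non-empty line, and a "type" field
    whose value starts with one of heavy_prefixes for expensive conversions.
    """
    for line in spec_text.splitlines():
        line = line.strip()
        if not line:
            continue
        snippet = json.loads(line)
        if str(snippet.get("type", "")).startswith(tuple(heavy_prefixes)):
            return True
    return False
```

Making heavy_prefixes configurable would be one way to let users customize which spec types are sent to the cluster.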