psychoinformatics-de / datalad-hirni

DataLad extension for (semi-)automated, reproducible processing of (medical/neuro)imaging data
http://datalad.org

loading custom_rules doesn't work from outside dataset #143

Open pvavra opened 4 years ago

pvavra commented 4 years ago

Not sure I'm understanding this correctly and it is a bug, or I misunderstood something.

I've got the following setup:

bids
├── code
├── sourcedata
│   ├── acq2
│   ├── code
│   │   ├── hirni_addons
│   │   ├── hirni-toolbox
│   │   └── __pycache__
│   ├── some_sub_day1_mri

where both bids and sourcedata are datalad datasets (as are hirni_addons and hirni-toolbox).

In sourcedata/.datalad/config, I've specified a custom_rules.py, which is located in sourcedata/code/. Since this definition lives inside the sourcedata dataset, it is simply:

[datalad "hirni.dicom2spec"]
    rules = "code/custom_rules.py"

And running, for example, datalad hirni-dicom2spec -d . [...] from within the sourcedata folder works just fine.

However, my understanding is that the following should also work, but it doesn't:

cd ~/scratch/bids
datalad hirni-dicom2spec -d sourcedata [...]
[WARNING] Ignored invalid path for dicom2spec rules definition: code/custom_rules.py
dicom2spec(ok): .gitattributes (file)
dicom2spec(ok):  some_sub_day1_mri/studyspec.json (file)
action summary:
  dicom2spec (ok: 2)

That is, I'm explicitly referencing which dataset is to be used and hence the path should work, I think. Maybe around lines 59f it should take into account the path of the dataset?
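
The warning is consistent with the relative path being checked against the current working directory rather than against the dataset root. A minimal sketch of that effect, using only the standard library and not datalad-hirni's actual code (the helper `rule_file_exists` and the throwaway directory layout are hypothetical):

```python
import os
import tempfile
from pathlib import Path


def rule_file_exists(rel_path, base=None):
    """Check a possibly relative path against `base`, or the cwd if base is None."""
    base = Path(base) if base is not None else Path.cwd()
    return (base / rel_path).is_file()


def demo():
    """Build a throwaway bids/sourcedata layout and probe it from the superdataset."""
    with tempfile.TemporaryDirectory() as tmp:
        bids = Path(tmp) / "bids"
        rules = bids / "sourcedata" / "code" / "custom_rules.py"
        rules.parent.mkdir(parents=True)
        rules.write_text("# placeholder rules file\n")

        old_cwd = os.getcwd()
        os.chdir(bids)  # mirrors `cd ~/scratch/bids` from the report
        try:
            # Resolved against the cwd (bids): the file is not found.
            from_cwd = rule_file_exists("code/custom_rules.py")
            # Resolved against the dataset root (bids/sourcedata): the file is found.
            from_dataset = rule_file_exists(
                "code/custom_rules.py", base=bids / "sourcedata"
            )
        finally:
            os.chdir(old_cwd)
        return from_cwd, from_dataset
```

Calling `demo()` checks the same relative path twice from inside `bids`: once against the working directory and once against `bids/sourcedata`. Only the second lookup finds the file.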

bpoldrack commented 4 years ago

> Not sure I'm understanding this correctly and it is a bug, or I misunderstood something.

Kinda both. ;-) The code doesn't distinguish where the config comes from. A relative path in a user-level config (~/.gitconfig) should have a different reference point than one coming from within a dataset (but what would that reference point be?). If all sources are treated uniformly, which is currently the case, only absolute paths work. But you're right: relative paths should work at least at the dataset level, and there it would be consistent to resolve them against the dataset's root. There's no point in committing absolute paths, after all. Given the way datalad's configs work, that will be a bit tricky to address, but you're absolutely right.
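
The rule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the extension's implementation: the function `resolve_rules_path` and the `source` labels are hypothetical, and the sketch assumes the caller already knows which config level a value was read from.

```python
from pathlib import Path


def resolve_rules_path(raw, source, dataset_root):
    """Return an absolute path for a rules entry, or None if it cannot be anchored.

    `source` is a hypothetical label for where the config value was read from:
    "dataset" for the committed .datalad/config, anything else for
    user- or system-level configuration.
    """
    path = Path(raw).expanduser()
    if path.is_absolute():
        # Absolute paths are unambiguous regardless of where they were defined.
        return path
    if source == "dataset":
        # Dataset-level relative paths are anchored at the dataset root.
        return Path(dataset_root) / path
    # A relative path from a user-level config has no obvious reference point.
    return None
```

With this approach, `code/custom_rules.py` from the dataset's committed config would resolve to the same file no matter which directory the command is run from, while a relative path from `~/.gitconfig` would still be rejected.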

pvavra commented 4 years ago

OK, I think I'm starting to understand the "inheritance" of configs.

I agree that relative paths should stay relative to the datasets they come from.

Maybe the same mechanics as for run-procedure could be used? Those can handle different locations/definitions. But I haven't looked at that code (yet).