psychoinformatics-de / datalad-hirni

DataLad extension for (semi-)automated, reproducible processing of (medical/neuro)imaging data
http://datalad.org
Other
5 stars 8 forks source link

Manual meta-aggregate needed for dicom conversion #190

Open manuelakuhn opened 3 years ago

manuelakuhn commented 3 years ago

I am trying to add a rule file to be used in the bids conversion. When I do that with a installed test dataset recreating the spec files does not work:

$ datalad create -c bids bids
$ cd bids
$ datalad install --dataset . --source https://github.com/psychoinformatics-de/hirni-demo sourcedata --recursive
$ cd sourcedata
$ mkdir code/custom_rules
$ wget -P code/custom_rules https://raw.githubusercontent.com/psychoinformatics-de/datalad-hirni/master/datalad_hirni/resources/rules/custom_rules_template.py
$ cat << EOT >> .datalad/config
[datalad "hirni.dicom2spec"]
        rules = code/custom_rules/custom_rules_template.py
EOT
$ datalad save -m "Add custom rule"
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms
[WARNING] found no DICOM metadata [dicom2spec(['/tmp/datalad-tests/bids/sourcedata/acq1/dicoms'])]
dicom2spec(impossible): ['/tmp/datalad-tests/bids/sourcedata/acq1/dicoms'] (file) [found no DICOM metadata]
action summary:
  dicom2spec (impossible: 1, notneeded: 1)

After some struggling around I found that using

$ datalad meta-aggregate -r

fixes it.

For me having yet another step to execute (and first have to find out what to do) does not seem very user friendly. If dicom2specs does not work without proper meta aggregation couldn't this be included automatically? For example instead of throwing the error message try a meta-aggregate first?

manuelakuhn commented 3 years ago

Ok, I tried to reproduce the problem again and ran into another one. The solution was not so simple as I first thought. It seems that in my original try, while I struggled to find a solution, I also did a bids conversion. Trying to reproduce this was kind of painful, but I could finally break it down to the following:

working solution:

$ datalad create -c bids bids
$ cd bids
$ datalad install --dataset . --source https://github.com/psychoinformatics-de/hirni-demo sourcedata --recursive
$ datalad hirni-spec2bids --anonymize sourcedata/studyspec.json sourcedata/*/studyspec.json
$ cd sourcedata
$ datalad meta-aggregate -r
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms

but doing an meta-aggregate in the wrong order breaks it:

$ datalad create -c bids bids
$ cd bids
$ datalad install --dataset . --source https://github.com/psychoinformatics-de/hirni-demo sourcedata --recursive
$ cd sourcedata
$ datalad meta-aggregate -r # thows error [1]
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms # -> does not work [2]
$ cd ..
$ datalad hirni-spec2bids --anonymize sourcedata/studyspec.json sourcedata/*/studyspec.json
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms # -> does not work
$ datalad meta-aggregate -r # -> no error anymore
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms # -> does not work

I did not manage to make this work anymore.

So in addition to the usability problem described in the original post calling meta-aggregate in the "wrong" order can also break things.

[1]. the error for meta-aggregate was this one:

[ERROR  ] Failed to get Dataset(/tmp/datalad-tests/bids/sourcedata/acq1/dicoms) metadata (dicom): [Errno 2] No such file or directory: '/tmp/datalad-tests/bids/sourcedata/acq1/dicoms/example-dicom-structural-master/.datalad/config' [filereader.py:dcmread:861] [meta_extract(/tmp/datalad-tests/bids/sourcedata/acq1/dicoms)]                                      
meta_extract(error): . (dataset) [Failed to get Dataset(/tmp/datalad-tests/bids/sourcedata/acq1/dicoms) metadata (dicom): [Errno 2] No such file or directory: '/tmp/datalad-tests/bids/sourcedata/acq1/dicoms/example-dicom-structural-master/.datalad/config' [filereader.py:dcmread:861]]                                                                            
[ERROR  ] Failed to get Dataset(/tmp/datalad-tests/bids/sourcedata/acq2/dicoms) metadata (dicom): [Errno 2] No such file or directory: '/tmp/datalad-tests/bids/sourcedata/acq2/dicoms/example-dicom-functional-master/.datalad/config' [filereader.py:dcmread:861] [meta_extract(/tmp/datalad-tests/bids/sourcedata/acq2/dicoms)]                                      
meta_extract(error): . (dataset) [Failed to get Dataset(/tmp/datalad-tests/bids/sourcedata/acq2/dicoms) metadata (dicom): [Errno 2] No such file or directory: '/tmp/datalad-tests/bids/sourcedata/acq2/dicoms/example-dicom-functional-master/.datalad/config' [filereader.py:dcmread:861]]
delete(ok): .datalad/metadata/objects/87/c6e7fcb632bcc497e3c0993115eb11a89ce56f.xz (file)                               
add(ok): .datalad/metadata/objects/09/0fa1eb66aa94c55260f2e03205194688dddbdb.xz (file)                                  
add(ok): .datalad/metadata/objects/63/85049ff34a3ccfb57fde88b7ee4ee5e958f514.xz (file)
add(ok): .datalad/metadata/objects/7f/d192969367dfc789d7db4e4a7006c68aa955fb.xz (file)                                  
add(ok): .datalad/metadata/objects/84/7008010f508d212e7bb71637c7f2cd322eff3d.xz (file)
add(ok): .datalad/metadata/objects/95/5d72036d2d7e6d7f92f081d89e8f32cc4423e5.xz (file)
add(ok): .datalad/metadata/objects/bd/cbc3c6c5ad689806e7196bbddc3a192285f194.xz (file)
add(ok): .datalad/metadata/objects/d6/bdaef19c45716bebb17966e7216ccd1b524da0.xz (file)
add(ok): .datalad/metadata/aggregate_v1.json (file)
save(ok): . (dataset)
meta_aggregate(ok): /tmp/datalad-tests/bids/sourcedata (dataset)
action summary:     
  add (ok: 8)       
  delete (ok: 1)    
  meta_aggregate (ok: 1)
  meta_extract (error: 2)
  save (ok: 1)
                    %                                                                     

[2] output of dicom2spec:

[WARNING] found no DICOM metadata [dicom2spec(['/tmp/datalad-tests/bids/sourcedata/acq1/dicoms'])]
dicom2spec(impossible): ['/tmp/datalad-tests/bids/sourcedata/acq1/dicoms'] (file) [found no DICOM metadata]
action summary:
  dicom2spec (impossible: 1, notneeded: 1)
bpoldrack commented 3 years ago

For me having yet another step to execute (and first have to find out what to do) does not seem very user friendly. If dicom2specs does not work without proper meta aggregation couldn't this be included automatically?

Indeed. The metadata aggregation is done by import-dcm and should be included in the hirni-demo dataset you installed. However, it appears to not be. Probably screwed that up with an update. Will fix that.

For example instead of throwing the error message try a meta-aggregate first?

I'd rather not go down that path. At least in the datalad-metalad version hirni is using, this could also trigger metadata extraction which can be very expensive. There may be another reason for missing metadata and I wouldn't enforce an automatic potentially expensive operation.

FTR: For now I don't see how the spec2bids call in your working solution is necessary or helpful. Not saying you're wrong - just wondering what exactly this is a side effect of.

So, from my point of view, the failures seem okay but the demo dataset isn't. Will double-check when I fixed that one.

manuelakuhn commented 3 years ago

I found this problem because I was trying to reproduce another one and to make it easier for you to reproduce my steps I switched to your test dataset. I did not test this again with another dataset (was way to frustrated after spending to much time :-)). So if you say this is only a problem with the hirni-demo dataset and that if you fix the dataset, the need to meta-aggregate goes away, that is fine then.

But generally: isn't it still problematic that things break depending on the order of the calls as shown above?

FTR: For now I don't see how the spec2bids call in your working solution is necessary or helpful. Not saying you're wrong - just wondering what exactly this is a side effect of.

That is why I was struggling for so long. I do not understand why it is needed here and it also does not really make sense from a logical point of few. But without it, it simply does not work and meta-aggregate runs into an error (see [1]).

bpoldrack commented 3 years ago

But generally: isn't it still problematic that things break depending on the order of the calls as shown above?

Yes, will have a closer look. However, spec2bids isn't generally needed for aggregation. It just does something internally that as a sideeffect solves an issue in that particular case.