psychoinformatics-de / datalad-hirni

DataLad extension for (semi-)automated, reproducible processing of (medical/neuro)imaging data
http://datalad.org
Other
5 stars 8 forks source link

Conversion fails without error but no conversion data #192

Open manuelakuhn opened 3 years ago

manuelakuhn commented 3 years ago

When adding the rule template in sourcedata, the conversion runs through without error, but no conversion data is available afterwards. This is triggered to the missing bids-modality entry in the rule template.

$ datalad create -c hirni sourcedata
$ cd sourcedata
$ mkdir code/custom_rules
$ wget -P code/custom_rules https://raw.githubusercontent.com/psychoinformatics-de/datalad-hirni/master/datalad_hirni/resources/rules/custom_rules_template.py
$ cat << EOT >> .datalad/config
[datalad "hirni.dicom2spec"]
        rules = code/custom_rules/custom_rules_template.py
EOT
$ datalad save -m "Add custom rule" --to-git
$ datalad hirni-import-dcm --anon-subject 001 <data_path>/sourcedata.tar.gz acq1

-> runs through without errors or warnings [1]

$ datalad create -c bids bids
$ cd bids
$ datalad install --dataset . --source ../sourcedata sourcedata --recursive
$ datalad hirni-spec2bids --anonymize sourcedata/studyspec.json sourcedata/*/studyspec.json

-> no errors only some warnings about WARNING: Missing specification value for key 'bids-modality' (full output [2]) but result dir not generated:

$ tree -L 1
.
├── dataset_description.json
├── participants.json -> .git/annex/...
├── participants.tsv -> .git/annex/...
├── README
└── sourcedata

1 directory, 4 files

Adding 'bids-modality': None to rule return dict in /tmp/datalad-tests/bids/sourcedata does not change the outcome.

Only after adding one modality for one type of data it works:

1 directory, 2 files


There are multiple problems with this but I guess the biggest one is, that if the rule is missing something, `spec2bids` should not run through without throwing an error and then simply not create a result (warning is not sufficient here). To put that one step further, there could already be an error when the rule is applied, meaning `dicom2spec` or `hirni-import-dcm` are executed. This would make rule development much easier and user friendly.

[1]:
```python
[INFO   ] Creating a new annex repo at /tmp/datalad-tests/sourcedata/acq1/dicoms
[INFO   ] Scanning for unlocked files (this may take some time)
[INFO   ] Adding content of the archive sourcedata.tar.gz into annex AnnexRepo(/tmp/datalad-tests/sourcedata/acq1/dicoms)
[INFO   ] Finished adding sourcedata.tar.gz: Files processed: 1665, +annex: 1665

[2]:

[INFO   ] Running procedure copy-converter
[INFO   ] == Command start (output follows) =====
+ '[' -d sourcedata/README ']'
+ datalad run --explicit --input sourcedata/README --output /tmp/datalad-tests/bids/README cp sourcedata/README /tmp/datalad-tests/bids/README
[INFO   ] Making sure inputs are available (this may take some time)
[INFO   ] == Command start (output follows) =====
[INFO   ] == Command exit (modification check follows) =====
add(ok): README (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  get (notneeded: 3)
  save (ok: 1)
[INFO   ] == Command exit (modification check follows) =====
copy-converter(ok): /tmp/datalad-tests/bids/sourcedata/studyspec.json [acquisition converted.]
[INFO   ] Running procedure copy-converter
[INFO   ] == Command start (output follows) =====
+ '[' -d sourcedata/dataset_description.json ']'
+ datalad run --explicit --input sourcedata/dataset_description.json --output /tmp/datalad-tests/bids/dataset_description.json cp sourcedata/dataset_description.json /tmp/datalad-tests/bids/dataset_description.json
[INFO   ] Making sure inputs are available (this may take some time)
[INFO   ] == Command start (output follows) =====
[INFO   ] == Command exit (modification check follows) =====
add(ok): dataset_description.json (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  get (notneeded: 3)
  save (ok: 1)
[INFO   ] == Command exit (modification check follows) =====
copy-converter(ok): /tmp/datalad-tests/bids/sourcedata/studyspec.json [acquisition converted.]
spec2bids(ok): /tmp/datalad-tests/bids/sourcedata/studyspec.json
[INFO   ] Running procedure hirni-dicom-converter
[INFO   ] == Command start (output follows) =====
[INFO   ] Making sure inputs are available (this may take some time)
[INFO   ] To obtain some keys we need to fetch an archive of size 1.6 GB
                                                                                                                       [INFO   ] == Command start (output follows) =====
INFO: Running heudiconv version 0.8.0 latest 0.9.0
INFO: Analyzing 1665 dicoms
INFO: Generated sequence info for 1 studies with 16 entries total
INFO: Study session for StudySessionInfo(locator=None, session=None, subject='001')
INFO: Need to process 1 study sessions
INFO: PROCESSING STARTS: {'subject': '001', 'outdir': '/tmp/datalad-tests/bids/.git/hirni-tmp-9iw566eo/', 'session': None}
INFO: Processing 16 pre-sorted seqinfo entries
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
WARNING: Missing specification value for key 'bids-modality'
INFO: Doing conversion using dcm2niix
INFO: Lock 139747288087800 acquired on /tmp/datalad-tests/bids/heudiconv.lock
INFO: Populating template files under /tmp/datalad-tests/bids/
INFO: Lock 139747288087800 released on /tmp/datalad-tests/bids/heudiconv.lock
INFO: PROCESSING DONE: {'subject': '001', 'outdir': '/tmp/datalad-tests/bids/.git/hirni-tmp-9iw566eo/', 'session': None}
[INFO   ] == Command exit (modification check follows) =====
[INFO   ] == Command exit (modification check follows) =====
hirni-dicom-converter(ok): /tmp/datalad-tests/bids/sourcedata/acq1/studyspec.json [acquisition converted.]
spec2bids(ok): /tmp/datalad-tests/bids/sourcedata/acq1/studyspec.json
action summary:
  copy-converter (ok: 2)
  hirni-dicom-converter (ok: 1)
  spec2bids (notneeded: 17, ok: 2)
bpoldrack commented 3 years ago

Two things in here, from my point of view:

  1. Template rule behavior. This is essentially #193
  2. This:

To put that one step further, there could already be an error when the rule is applied, meaning dicom2spec or hirni-import-dcm are executed. This would make rule development much easier and user friendly.

I think the solution to that is a specification "validator" of some kind. A dedicated command, that can be called independently by users but also internally by spec2bids (and may be import-dcm).

Edit: However, such a command can't really cover the BIDS validity of the outcome of a specification before actually executing. Since arbitrary conversion procedures can be included, we can't predict outcome beforehand in a meaningful way. And after execution - well, there already is a BIDS validator. It can cover a few aspects of how a specification may be (technically) invalid, but less so its result.

manuelakuhn commented 3 years ago

I think the solution to that is a specification "validator" of some kind. A dedicated command, that can be called independently by users but also internally by spec2bids (and may be import-dcm).

Exactly what I was thinking :-) I think it is enough to check that hirni works fine with this rule and does not run into errors (e.g. because of a missing key as shown above). If the result after the conversion is bids conform or not that's what the bids-validator is for and I don't know if it is really necessary to have that kind of check, also partially, inside of hirni (overkill?).

Even though with the checks implemented, errors in the rule file will be unlikely, I still think that spec2bids should not silently stop if anything goes wrong with the conversion.

As a side remark: Even though I know that #193 and #194 are connected to this (and I discovered them actually while working on this problem here), I thought that it would make your life easier to have them in separate issues (for me they reflected different types of problems)

bpoldrack commented 3 years ago

I think it is enough to check that hirni works fine with this rule and does not run into errors (e.g. because of a missing key as shown above).

That's where things become a little complicated, though. hirni didn't run into an error. The conversion procedure just didn't produce an output. That procedure happens to be the default, but technically isn't part of hirni but the toolbox. You could use hirni without the toolbox and/or a different conversion procedure for the DICOMs (that may not be based on heudiconv). A specification validator could of course incorporate some knowledge of that default procedure and what it needs, but this needs to be done with awareness that such checks are only valid if the default conversion is actually used.

As a side remark: Even though I know that #193 and #194 are connected to this (and I discovered them actually while working on this problem here), I thought that it would make your life easier to have them in separate issues (for me they reflected different types of problems)

Yes, appreciated! :)