psychoinformatics-de / datalad-hirni

DataLad extension for (semi-)automated, reproducible processing of (medical/neuro)imaging data
http://datalad.org
Other
5 stars 8 forks source link

dicom2spec result not deterministic #191

Closed manuelakuhn closed 3 years ago

manuelakuhn commented 3 years ago

The studyspec.json file differ depending if a rule is registered before or after data is imported

Creating a dataset where the rule is registered before the import

$ datalad create -c hirni register_before_import
$ cd register_before_import
$ mkdir code/custom_rules
$ wget -P code/custom_rules https://raw.githubusercontent.com/psychoinformatics-de/datalad-hirni/master/datalad_hirni/resources/rules/custom_rules_template.py
$ cat << EOT >> .datalad/config
[datalad "hirni.dicom2spec"]
        rules = code/custom_rules/custom_rules_template.py
EOT
$ datalad hirni-import-dcm --anon-subject 001 https://github.com/datalad/example-dicom-structural/archive/master.tar.gz acq1
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms
$ cd ..

Creating a dataset where the rule is registered after the import

$ datalad create -c hirni register_after_import
$ cd register_after_import
$ datalad hirni-import-dcm --anon-subject 001 https://github.com/datalad/example-dicom-structural/archive/master.tar.gz acq1
$ mkdir code/custom_rules
$ wget -P code/custom_rules https://raw.githubusercontent.com/psychoinformatics-de/datalad-hirni/master/datalad_hirni/resources/rules/custom_rules_template.py
$ cat << EOT >> .datalad/config
[datalad "hirni.dicom2spec"]
        rules = code/custom_rules/custom_rules_template.py
EOT
$ datalad hirni-dicom2spec -s acq1/studyspec.json acq1/dicoms
$ cd ..

Now the content of both studyspecs differs:

$ cat register_before_import/acq1/studyspec.json
{"anon-subject":{"approved":false,"value":"001"},"bids-session":{"approved":false,"value":null},"comment":{"approved":false,"value":"I actually have no clue"},"dataset-id":"6f530c04-1e18-4e2c-9f79-b82b1617f01f","dataset-refcommit":"6e990b833e0b0565f091a33398b10b0932ac8745","description":{"approved":false,"value":"anat-T1w"},"location":"dicoms","procedures":[{"on-anonymize":{"approved":false,"value":false},"procedure-call":{"approved":false,"value":null},"procedure-name":{"approved":false,"value":"hirni-dicom-converter"}}],"subject":{"approved":false,"value":"02"},"type":"dicomseries:all"}
{"anon-subject":{"approved":false,"value":null},"bids-session":{"approved":false,"value":null},"comment":{"approved":false,"value":"I actually have no clue"},"dataset-id":"6f530c04-1e18-4e2c-9f79-b82b1617f01f","dataset-refcommit":"6e990b833e0b0565f091a33398b10b0932ac8745","description":{"approved":false,"value":"anat-T1w"},"location":"dicoms","subject":{"approved":false,"value":"02"},"tags":[],"type":"dicomseries","uid":"1.2.826.0.1.3680043.2.1143.515404396022363061013111326823367652"}
$ cat register_after_import/acq1/studyspec.json
{"anon-subject":{"approved":false,"value":"001"},"bids-acquisition":{"approved":false,"value":null},"bids-contrast-enhancement":{"approved":false,"value":null},"bids-direction":{"approved":false,"value":null},"bids-echo":{"approved":false,"value":null},"bids-modality":{"approved":false,"value":"t1w"},"bids-reconstruction-algorithm":{"approved":false,"value":null},"bids-run":{"approved":false,"value":"1"},"bids-session":{"approved":false,"value":null},"bids-task":{"approved":false,"value":null},"comment":{"approved":false,"value":""},"dataset-id":"537143ad-a2cf-48e8-838a-9131e0e4379c","dataset-refcommit":"65cad7ed523a1cc524ca20bd52c20ebaa46ae76c","description":{"approved":false,"value":"anat-T1w"},"id":{"approved":false,"value":401},"location":"dicoms","procedures":[{"on-anonymize":{"approved":false,"value":false},"procedure-call":{"approved":false,"value":null},"procedure-name":{"approved":false,"value":"hirni-dicom-converter"}}],"subject":{"approved":false,"value":"02"},"type":"dicomseries:all"}
{"anon-subject":{"approved":false,"value":null},"bids-acquisition":{"approved":false,"value":null},"bids-contrast-enhancement":{"approved":false,"value":null},"bids-direction":{"approved":false,"value":null},"bids-echo":{"approved":false,"value":null},"bids-modality":{"approved":false,"value":"t1w"},"bids-reconstruction-algorithm":{"approved":false,"value":null},"bids-run":{"approved":false,"value":"1"},"bids-session":{"approved":false,"value":null},"bids-task":{"approved":false,"value":null},"comment":{"approved":false,"value":"I actually have no clue"},"dataset-id":"537143ad-a2cf-48e8-838a-9131e0e4379c","dataset-refcommit":"65cad7ed523a1cc524ca20bd52c20ebaa46ae76c","description":{"approved":false,"value":"anat-T1w"},"id":{"approved":false,"value":401},"location":"dicoms","subject":{"approved":false,"value":"02"},"tags":[],"type":"dicomseries","uid":"1.2.826.0.1.3680043.2.1143.515404396022363061013111326823367652"}

to give a few examples: register_after_import contains keys that are not present in register_before_import, for example for entry anat-T1w:

The reason is, that if no rule was registered before importing data, the default heuristics are used and otherwise the defaults are ignored. Even though this should not change the conversion result, it is confusing.

Although not perfect, but an easy solution could be to execute the default heuristic always first.

bpoldrack commented 3 years ago

The reason is, that if no rule was registered before importing data, the default heuristics are used and otherwise the defaults are ignored. Even though this should not change the conversion result, it is confusing.

Correct.

Although not perfect, but an easy solution could be to execute the default heuristic always first.

Yes, I agree. What I plan to do is: Make the default rule an actual, independent rule, that can be configured (and is configured by default). Since one can already configure several rules to be applied subsequently, that would by default result in the behavior you suggested, but would also allow to easily remove the default rules if desired. Apart from some minor code changes, that will require to "install" rules delivered with hirni into some dedicated location.

Minor comment in case that's not entirely clear: The call to hirni-dicom2spec after hirni-import-dcm in the "before case" is superfluous. The import command calls dicom2spec internally at the end.

bpoldrack commented 3 years ago

In my mind, this boils down to #194, so I will close here. Feel free to reopen, if you think there's more to it, @manuelakuhn