psychoinformatics-de / datalad-hirni

DataLad extension for (semi-)automated, reproducible processing of (medical/neuro)imaging data
http://datalad.org

add docs: how to add custom rules for dicom import #137

Open pvavra opened 4 years ago

pvavra commented 4 years ago

In our dicom-tarball, there are 21 different dicom series, but the current heuristic does a relatively poor job at guessing most of the studyspec fields (studyspec.zip).

The docs do not really specify how to define and "load" custom rules for the import step.

I found the custom rules template and a reference to the config option that points to a custom one, but I do not understand how to adapt the template and actually load it. Further, there are several datalad datasets involved in a dicom-to-bids workflow (bids, sourcedata, hirni-toolbox). In which .datalad/config should the custom rule be referenced?

Seems related to #74.

bpoldrack commented 4 years ago

I would suggest using the .datalad/config of sourcedata (that is, the study or raw dataset that hirni's docs refer to). However, several configs are considered and can take precedence over each other (see below).

I'll try to quickly get you started and agree that I need to write that down more extensively and carefully:

Along with the template you should find test_rules, which are derived directly from that template as simple examples. Generally, such a rule class has two important spots: its __init__ and its __call__ method. As you'd guess, __init__ is called only once for any preparations, while __call__ is responsible for actually applying the "rule". It does so by returning a dictionary that is supposed to end up in the specification. That method will be called on each dicomseries found in the metadata and should return a spec dictionary for it. Note that several rules can be applied and might (partially) overwrite each other.

The dictionary itself should be easy to understand if you understand those studyspecs, except maybe for the subject entry. There's subject and possibly anon-subject, but no bids-subject, although in conversion routines you'd refer to bids-subject. This can be confusing. bids-subject exists only "virtually" and is generated on the fly during conversion, because it refers either to the spec's subject or to its anon-subject field, depending on whether the conversion was called with --anonymize.
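To make the shape of such a rule concrete, here is a minimal, self-contained sketch. It does not use hirni's actual API: the class names, the `apply_rules` and `resolve_bids_subject` helpers, the `bids-modality` key, and the DICOM field names are all assumptions for illustration, and spec values are simplified to plain strings. The real constructor and call signatures should be taken from the template and test_rules.

```python
class SeriesDescriptionRule:
    """Hypothetical rule: guesses spec fields from one series' metadata."""

    def __init__(self):
        # Called once: prepare whatever every series will need.
        self.modality_by_keyword = {"t1": "T1w", "bold": "bold"}

    def __call__(self, series):
        # Called per DICOM series: return the fields this rule proposes.
        description = series.get("SeriesDescription", "").lower()
        spec = {"subject": series.get("PatientID")}
        for keyword, modality in self.modality_by_keyword.items():
            if keyword in description:
                spec["bids-modality"] = modality
        return spec


class AnonSubjectRule:
    """Hypothetical rule: adds anon-subject from a lookup table."""

    def __init__(self, anon_ids):
        self.anon_ids = anon_ids

    def __call__(self, series):
        patient = series.get("PatientID")
        if patient in self.anon_ids:
            return {"anon-subject": self.anon_ids[patient]}
        return {}


def apply_rules(rules, series):
    # Assumed merge behaviour: later rules overwrite keys set by earlier ones.
    spec = {}
    for rule in rules:
        spec.update(rule(series))
    return spec


def resolve_bids_subject(spec, anonymize):
    # bids-subject is never stored; it is picked at conversion time.
    return spec["anon-subject"] if anonymize else spec["subject"]


series = {"SeriesDescription": "T1 MPRAGE", "PatientID": "P01"}
rules = [SeriesDescriptionRule(), AnonSubjectRule({"P01": "001"})]
spec = apply_rules(rules, series)
```

With this setup, `resolve_bids_subject(spec, anonymize=True)` picks the anon-subject value and `anonymize=False` picks subject, which mirrors the --anonymize behaviour described above.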

As to "loading a rule": this is done solely via that config. It is supposed to point to such a rule file, and that file itself has a module-level attribute declaring a particular class to be "the rule". That way you can have complex code with several classes in that file. This config can be specified several times and at several levels (like any git/datalad config). Order matters: first system-level (that's /etc/gitconfig by default), then user-level (~/.gitconfig), local (.git/config) and dataset (.datalad/config). That way, rules more specific to what you're currently doing (potentially) overwrite more general rules. Within each config level, the order is the one in which the entries are stored in the file. dicom2spec (which is called internally by import-dcm) will look for those configs and thereby retrieve a list of rules to apply (and the order in which to apply them).
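As a rough illustration of that mechanism (not hirni's actual code), the sketch below loads a rule file by path, reads a module-level attribute that names the rule class, and flattens per-level config entries into one ordered list. The attribute name `rule_class`, both function names, and the level labels are assumptions; the real attribute name is the one used in the template.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical attribute name; the template defines the real one.
RULE_ATTRIBUTE = "rule_class"

# Order described above: general levels first, dataset level last.
LEVEL_ORDER = ["system", "user", "local", "dataset"]


def load_rule_class(path):
    """Import a .py rule file by path and return the class it declares."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # The file may define many classes; only the declared one is "the rule".
    return getattr(module, RULE_ATTRIBUTE)


def ordered_rule_files(entries_by_level):
    """Flatten config entries, keeping file order within each level."""
    files = []
    for level in LEVEL_ORDER:
        files.extend(entries_by_level.get(level, []))
    return files


# A rule file with a helper class and one declared rule class.
RULE_SOURCE = """
class Helper:
    pass

class MyRule:
    def __call__(self, series):
        return {"subject": "example"}

rule_class = MyRule
"""

with tempfile.TemporaryDirectory() as tmp:
    rule_file = Path(tmp) / "my_rules.py"
    rule_file.write_text(RULE_SOURCE)
    loaded = load_rule_class(rule_file)
    result = loaded()({})
```

Rules loaded in the order returned by `ordered_rule_files` would then be applied one after another, so entries from more specific levels get the last word.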

Hope that gives you an idea for now.

pvavra commented 4 years ago

@bpoldrack Thanks for the explanation! Now we need to move it to the docs ;-)

I've created a new commit in hirni_addons that adds custom rules and a procedure to add them to the datasets' config, and it should be working :)