neurorepro opened 1 year ago
You may want to consider #682 first (changes are included in this PR)
Attention: 5 lines in your changes are missing coverage. Please review.
Comparison is base (e3c53b8) 81.94% compared to head (e9c9289) 81.98%. Report is 43 commits behind head on master.
| Files | Patch % | Lines |
|---|---|---|
| heudiconv/utils.py | 73.68% | 5 Missing :warning: |
@yarikoptic As discussed in #624 I implemented a hash to shorten the name, but the full Series UID could be used too
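For illustration, the hashing idea could look roughly like this; `uid_hash` is a hypothetical helper sketching the approach, not the actual code in heudiconv/utils.py (the 8-character length matches what was later settled on in this thread):

```python
import hashlib

def uid_hash(series_uid: str, length: int = 8) -> str:
    """Shorten a DICOM SeriesInstanceUID to a fixed-length hex digest.

    Hypothetical helper illustrating the idea in this PR; the real
    implementation may differ.
    """
    return hashlib.md5(series_uid.encode()).hexdigest()[:length]

# Example with a made-up SeriesInstanceUID
short = uid_hash("1.3.12.2.1107.5.2.43.66044.2019010110203040506070809")
print(short, len(short))
```

The hash keeps series ids short and stable while still being unique per acquisition for all practical purposes.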
In general I think the approach makes sense. Re why not before: I guess (@satra and @mih might correct me if they have any recollection of the motivations) series id and name (instead of UID) were chosen for human accessibility, since they make it easier to understand which particular series is in question while annotating .edit.txt
and just debugging the operation. So far I do not see an immediate problem with adding the series UID since AFAIK it should indeed be the same for all files within a single acquisition. Might be worth checking though on more of the combined ones like T1PD etc.
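To make the failure mode concrete: a minimal sketch (with made-up metadata keys and UID values, not heudiconv's actual field names) of why grouping on the human-readable series id alone can conflate two acquisitions that the UID would keep apart:

```python
from collections import defaultdict

# Hypothetical per-file metadata, as might be read from DICOM headers.
files = [
    {"path": "f1.dcm", "series_id": "5-anat_T1w", "series_uid": "1.2.840.1"},
    {"path": "f2.dcm", "series_id": "5-anat_T1w", "series_uid": "1.2.840.1"},
    # Same human-readable id from a second scanning session forced into
    # one study, but a different SeriesInstanceUID:
    {"path": "f3.dcm", "series_id": "5-anat_T1w", "series_uid": "1.2.840.2"},
]

by_uid = defaultdict(list)
for f in files:
    by_uid[f["series_uid"]].append(f["path"])

# Grouping on series_id alone would lump all three files together;
# grouping on the UID keeps the two acquisitions apart.
print(dict(by_uid))
# {'1.2.840.1': ['f1.dcm', 'f2.dcm'], '1.2.840.2': ['f3.dcm']}
```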
TL;DR: we need to approach it with caution and thus need more checking/work. Some reservations/concerns and thinking out loud:

- The issue arises only with `-g all` or `-g accession`, i.e. where we force multiple scanning sessions into a single conversion study. That would partially explain why we have not run into it, I guess. Such groupings were added only a few years back IIRC and aren't the default mode of operation.
- `dcm2niix` would still do "the right thing" and convert those incorrectly "grouped" files into different files, since it would rely on the series UID. So it is just that neither a heuristic nor heudiconv would know how to tease them apart.
- It would break the ability of `heudiconv` to operate on previously produced annotations in `-c none` mode for the subsequent conversion with `-c dcm2niix`, since all group names would change. Needs a check: should `heudiconv` become smart and detect this mode of work while reloading an already existing mapping file, and then not bother adding the UID? Might help in a transition period but generally not needed/too cumbersome... maybe we would just add a check while loading an existing mapping that all the series ids do include that UID hash and error out if they don't with an informative message. I think the latter is better.
- Alternatively, add the UID only under `-g all` or `-g accession` (I don't like that). But then it adds differences in behavior based on CLI options, which is not good.

@yarikoptic thank you very much for the feedback
Regarding the scope of the use case: I think it is actually common to artificially split a single BIDS session in two when the scanning protocol is too long (e.g. AM: anat, resting state, ...; PM: diffusion MRI, ...) or the subject needs to exit. It is actually mentioned explicitly in the BIDS definition of session (definition number 5):

> if a subject has to leave the scanner room and then be re-positioned on the scanner bed, the set of MRI acquisitions will still be considered as a session and match sessions acquired in other subjects. Similarly, in situations where different data types are obtained over several visits (for example fMRI on one day followed by DWI the day after) those can be grouped in one session.
The fact that not many people reported it, I think, is because users did not know what went wrong and would not think it might be due to heudiconv (who would think of blaming such a magnificent tool?).
Regarding the change of behavior, I totally agree that this should be done carefully. I think what seems to be your preferred option would be best:

> maybe we would just add a check while loading an existing mapping that all the series ids do include that UID hash and error out if they don't with an informative message. I think the latter is better.
I will update the PR with that implemented.
If the error is too much for users (for whom the use case of needing to keep previous data without the UID is justified), then maybe we could later add a temporary option `--with-deprecated-series-id`?
@yarikoptic I modified the code to check whether the conversion table uses deprecated series IDs. If it does, it recomputes the conversion table with the non-deprecated series IDs.
I think in the end this may be less frustrating than an error since it identifies the deprecated IDs and recomputes the conversion table accordingly.
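The detect-and-recompute behavior could be sketched roughly as follows; the function name, suffix format, and table layout are illustrative assumptions, not the PR's actual code:

```python
import re

# Hypothetical: new-style series ids end with an 8-char hex UID hash,
# e.g. "5-anat_T1w-1a2b3c4d"; old (deprecated) ids lack that suffix.
NEW_STYLE = re.compile(r"-[0-9a-f]{8}$")

def has_deprecated_ids(mapping: dict) -> bool:
    """Return True if any series id in a loaded conversion table
    predates the UID-hash suffix (and the table should be recomputed)."""
    return any(not NEW_STYLE.search(series_id) for series_id in mapping)

old_table = {"5-anat_T1w": "anat/sub-01_T1w"}
new_table = {"5-anat_T1w-1a2b3c4d": "anat/sub-01_T1w"}

print(has_deprecated_ids(old_table))  # True  -> recompute the table
print(has_deprecated_ids(new_table))  # False -> reuse as-is
```

Recomputing on detection, rather than erroring out, spares users from having to understand why their old mapping file suddenly stopped working.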
I also set the hash length to 8 as you suggested.
Hello @yarikoptic, does the code look OK now? Cheers!
@neurorepro ping on above comments.
Yes, totally for the tests. I actually tested it on two datasets (one already containing old data, and one newly converted) and it went fine. But indeed tests need to be included for continuous testing.
I'll get back to you!
a gentle ping
This is to address #624