InsightSoftwareConsortium / ITK

Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.
https://itk.org
Apache License 2.0
1.43k stars 666 forks source link

Behaviour change between ITK 5.3 and 5.4 with DICOM spacing #4794

Open seanm opened 3 months ago

seanm commented 3 months ago

Further to some discussions in https://github.com/InsightSoftwareConsortium/ITK/pull/4647 and https://github.com/InsightSoftwareConsortium/ITK/pull/4521, we still have a situation that isn't quite clear to us, and since those are merged already, we thought we'd create a new ticket.

Basically, there is a change in behaviour between ITK 5.3 and 5.4 that we are trying to understand if it's a bug or deliberate & correct.

If we query a DICOM file for its image spacing information, we get different results depending on:

A small C++ test case is attached: inconsistency_in_spacing.zip

It uses a DICOM file named D_CLUNIE_CT1_J2KR1.dcm, from the GDCM test suite I think, but from David Clunie's collection ultimately I suppose.

In ITK 5.3:

Number of files ImageSeriesReader GDCMImageIO
Single [0.661468, 0.661468, 1] [0.661468, 0.661468, 1]
Multiple [0.661468, 0.661468, 1] [0.661468, 0.661468, 1]

Notice everything agrees: the z spacing is 1.

In ITK 5.4 / master:

Number of files ImageSeriesReader GDCMImageIO
Single [0.661468, 0.661468, 5] [0.661468, 0.661468, 5]
Multiple [0.661468, 0.661468, 1] [0.661468, 0.661468, 5]

Notice now everything does not agree.

The change from 1 to 5 is presumably deliberate from https://github.com/InsightSoftwareConsortium/ITK/pull/4521.

But then shouldn't that lone 1 (in the bottom left) also be 5?

Thanks.

issakomi commented 3 months ago

The behavior change may probably be related to these lines introduced in the commit https://github.com/malaterre/GDCM/commit/836f7a7454ae0edd2713ecb60d66da11349d3a0d. In fact ImageHelper::SecondaryCaptureImagePlaneModule variable seems to affect the several MediaStorage classes above, maybe there should be logic + break before? I don't know. Even if I am on the list of co-authors, I didn't make the change, we started with very simple proposal https://github.com/InsightSoftwareConsortium/ITK/pull/4120/files. If I remember correctly this particular change was done by the GDCM owner, but I am not sure.

BTW, the 'Multiple' series is just duplicated file, both with the same Instance UID (this is invalid) and z-spacing is just a matter of fallback (it is 0 or undefined).

issakomi commented 3 months ago

I mean:

bb

FYI, I will not do any PRs or other actions related to this issue

Edit: if it is still not clear: this looks for me as a fall-through bug in switch, the value of ImageHelper::SecondaryCaptureImagePlaneModule (hard-coded to true in GDCM IO) should not affect other IODs.

issakomi commented 3 months ago

https://discourse.slicer.org/t/regression-in-the-dicom-data-base/37913 may be related

thewtex commented 2 months ago

CC @dclunie

pieper commented 1 month ago

I looked into this some more and I'm thinking now that the new behavior is generally correct.

For the case of the single file, it has a SpacingBetweenSlices defined, which is valid (optional) for a CTImageStorage and with the new code this is not ignored.

(base) pieper@hive clunie_single % dcmdump D_CLUNIE_CT1_J2KR1.dcm| grep SpacingBetween
(0018,0088) DS [5.000000]                               #   8, 1 SpacingBetweenSlices

In the code pointed to by @issakomi it looks to me like all the Media Storage types above the break could validly have a SpacingBetweenSlices (although I didn't check them all) and probably it should be used if it's there, at least for the single slice case. I believe ImageHelper::SecondaryCaptureImagePlaneModule is misleadingly named, but probably has an effect here that is consistent with the standard, even though it's different from what GDCM used to do (we can decide if that's an improvement or a regression).

For the multiple slices it's a little weirder. It seems the test case the positions are the same for the two slices, so probably ImageSeriesReader treats this as a degenerate case and falls back to 1, while GDCMImageIO just reports what's in the file.

(base) pieper@hive clunie_multiple % dcmdump D_CLUNIE_CT1_J2KR1_01.dcm| grep ImagePosition 
(0020,0032) DS [-158.135803\-179.035797\-75.699997]     #  34, 3 ImagePositionPatient
(base) pieper@hive clunie_multiple % dcmdump D_CLUNIE_CT1_J2KR1_02.dcm| grep ImagePosition
(0020,0032) DS [-158.135803\-179.035797\-75.699997]     #  34, 3 ImagePositionPatient
(base) pieper@hive clunie_multiple % dcmdump D_CLUNIE_CT1_J2KR1_01.dcm| grep SpacingBetween
(0018,0088) DS [5.000000]                               #   8, 1 SpacingBetweenSlices
(base) pieper@hive clunie_multiple % dcmdump D_CLUNIE_CT1_J2KR1_02.dcm| grep SpacingBetween
(0018,0088) DS [5.000000]                               #   8, 1 SpacingBetweenSlices

The ITK readers defaulting to a spacing of 1 when the value really should be undefined is probably a bit of legacy behavior we can't easily change now, even if it's confusing since you don't know if 1 is valid or not. Then again, I'm not sure what the correct answer is here: the files are in the same location but claim to be 5mm apart, so it's not clear which part of the file should take precedence.

So while I think the GDCM change is okay, something about the behavior of the series reader is causing a problem in Slicer on some real CT data that is being discussed further here: https://discourse.slicer.org/t/regression-in-the-dicom-data-base/37913/8.

seanm commented 1 month ago

The ITK readers defaulting to a spacing of 1 when the value really should be undefined is probably a bit of legacy behavior we can't easily change now, even if it's confusing since you don't know if 1 is valid or not.

I suppose we could still change this behaviour, using the ITK_LEGACY_REMOVE feature to bring it in slowly...