Segmentation not working for 2D X-ray mammography DICOMs

KarolBorkowski commented 2 years ago

Generation of Segmentation (SEG) images doesn't work for mammography images because it requires tags that are optional for these images and usually not present, in particular FrameOfReferenceUID, SliceThickness, ImageOrientationPatient and ImagePositionPatient. I'm attaching an example mammography DICOM file.
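A minimal sketch of how one might check which of these attributes a given image lacks. A plain dict stands in for the parsed DICOM header, and both the function name and the example values are hypothetical; the check only relies on `keyword in dataset`, so it should also work with dataset objects that support keyword membership tests.

```python
# Attribute keywords named in the report above.
SPATIAL_KEYWORDS = (
    "FrameOfReferenceUID",
    "SliceThickness",
    "ImageOrientationPatient",
    "ImagePositionPatient",
)


def missing_spatial_attributes(dataset):
    """Return the keywords from SPATIAL_KEYWORDS that `dataset` lacks.

    `dataset` is any container that supports `keyword in dataset`;
    a plain dict is used here as a stand-in for a parsed DICOM file.
    """
    return [kw for kw in SPATIAL_KEYWORDS if kw not in dataset]


# Hypothetical stand-in for a mammography header (values are made up).
mammo = {
    "Modality": "MG",
    "Rows": 4096,
    "Columns": 3328,
    "ImagerPixelSpacing": [0.07, 0.07],
}

print(missing_spatial_attributes(mammo))
```

For the stand-in header above, all four keywords are reported as missing.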

Thanks for considering it!

test_dicom.zip

hackermd commented 2 years ago

Thanks @KarolBorkowski for reporting the issue and for providing a test file.

You may be able to create a Segmentation instance by providing pixel_measures to the constructor (see also https://github.com/herrmannlab/highdicom/issues/99).

hackermd commented 2 years ago

I am not sure what to do about a missing Frame of Reference UID (the Frame of Reference appears to be conditionally required). @CPBridge do you have any experience with segmentation of X-Ray images?

KarolBorkowski commented 2 years ago

pixel_measures is not enough, since the rest of the listed tags are missing too. When I set them to some random values, the instance is generated, but I wonder how these values are used by DICOM viewers and if it's safe to use any values.

fedorov commented 2 years ago

I am not sure if anyone used SEG with X-Ray images. It is not impossible that there are some gaps or incorrect assumptions in the standard to support this use case. I want to make sure @dclunie is aware of this conversation to comment on the issues you are observing.

CPBridge commented 2 years ago

I have never tried this myself. I had a very brief initial look into this and I do not yet have reason to believe that there are inconsistencies in the standard itself, although it is certainly true that we did not consider the case where the Frame of Reference UID (and hence the other attributes) is missing when we wrote the segmentation implementation. We can (and should) look into this, but it won't be a totally straightforward fix.

"I wonder how these values are used by DICOM viewers and if it's safe to use any values."

I imagine this is likely to cause issues with viewers, and regardless it would not be a good idea to populate this with random values unless this is in a completely isolated and controlled research setting.

CPBridge commented 2 years ago

I have started a branch to address this issue, but unfortunately it's not so straightforward...

Currently highdicom uses a fixed list of Dimension Index Pointers (that describe the organization of the frames within the segmentation) depending on the coordinate system. For segmentations of images in the PATIENT coordinate system, this is:

SegmentNumber
ImagePositionPatient

For segmentations of images in the SLIDE coordinate system, this is:

SegmentNumber
Row Position in Total Image Pixel Matrix
Column Position in Total Image Pixel Matrix
X Offset in Slide Coordinate System
Y Offset in Slide Coordinate System
Z Offset in Slide Coordinate System

When there is no frame of reference UID, and therefore the image does not exist within a defined coordinate system, we cannot use either of these dimension organizations. I'm not sure what to do instead and would appreciate others' input (@hackermd, @fedorov, @dclunie). A couple of thoughts I had:

We just use the SegmentNumber on its own, meaning that the segmentation can only refer to a single X-ray image (I believe in practice there are X-rays that exist within series with different views per instance, but I have not worked much with them in the past).
We use an attribute like InstanceNumber to order the frames in a somewhat intuitive way, but this is somewhat problematic because InstanceNumber is a type 2 attribute and so may be present with no value.
Something else I haven't thought of, open to suggestions...

Thanks

hackermd commented 2 years ago

Thanks @CPBridge. I don't really have a lot of experience with handling of X-ray images (or other images that lack a frame of reference), and defer to @fedorov's and @dclunie's expertise. Naively, I would assume that these images are always single-frame images and no additional dimension index would be needed.

fedorov commented 2 years ago

I talked with @dclunie about this today, and he said that the approach you propose, Chris, is appropriate. Also, attributes describing image geometry and FoR are not required if there is a reference to the source images. I think David was going to follow up in this issue to perhaps clarify this further. I personally don't have experience generating segmentations for X-ray, unfortunately. This use case is not handled by dcmqi. There was a related issue there (see https://github.com/QIICR/dcmqi/issues/319), which was never addressed, and with highdicom available now, there are even fewer reasons to address it, given that no one contributes to or maintains dcmqi other than me and I can't make time for this.

CPBridge commented 2 years ago

Thanks @fedorov! I think the parts relating to the geometric attributes are fairly clear in the standard, but any further clarification about the dimension organization would be appreciated. Based on the feedback I have so far I will just go with the Segment Number dimension alone when there is no FrameOfReferenceUID in the input image, and also disallow passing more than one source image in this case.
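The selection rule described in this thread could be sketched roughly as follows. This is not the highdicom implementation: plain dicts stand in for the source image datasets, the function name is hypothetical, and the SLIDE entries use the DICOM keywords as I understand them.

```python
# Fixed lists of dimension index pointers, as described earlier in the thread.
PATIENT_POINTERS = ["SegmentNumber", "ImagePositionPatient"]
SLIDE_POINTERS = [
    "SegmentNumber",
    "RowPositionInTotalImagePixelMatrix",
    "ColumnPositionInTotalImagePixelMatrix",
    "XOffsetInSlideCoordinateSystem",
    "YOffsetInSlideCoordinateSystem",
    "ZOffsetInSlideCoordinateSystem",
]
# Proposed fallback when the source image has no Frame of Reference.
NO_FOR_POINTERS = ["SegmentNumber"]


def choose_dimension_index_pointers(source_images, coordinate_system="PATIENT"):
    """Pick dimension index pointers for a segmentation (hypothetical helper).

    `source_images` is a list of dict-like headers; `coordinate_system`
    is only consulted when a FrameOfReferenceUID is present.
    """
    if not source_images:
        raise ValueError("At least one source image is required.")
    has_frame_of_reference = bool(source_images[0].get("FrameOfReferenceUID"))
    if not has_frame_of_reference:
        # Without a coordinate system, frames cannot be ordered spatially,
        # so only a single source image is allowed.
        if len(source_images) > 1:
            raise ValueError(
                "Only one source image is allowed without a FrameOfReferenceUID."
            )
        return list(NO_FOR_POINTERS)
    if coordinate_system == "PATIENT":
        return list(PATIENT_POINTERS)
    if coordinate_system == "SLIDE":
        return list(SLIDE_POINTERS)
    raise ValueError(f"Unknown coordinate system: {coordinate_system!r}")


print(choose_dimension_index_pointers([{"Modality": "MG"}]))
print(choose_dimension_index_pointers([{"FrameOfReferenceUID": "1.2.3"}]))
```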

dclunie commented 2 years ago

Short version: SegmentNumber (at least)

TL;DNR:

From the perspective of the creator:

To establish the mapping between Segmentation object (SEG) frames and referenced image frames, in the absence of a Frame of Reference (FoR), one needs to specify the frame-specific references in DerivationImageSequence.

It is those frame-specific references that a SEG recipient will use to establish the match between SEG frame and reference image frame, regardless of what is present in the Multi-frame Dimension Module (MFDM).

That said, an MFDM is mandatory in SEG, so that raises the question of what to put in it when there is no FoR and no obvious spatial mapping to use as the basis for defining "dimensions".

There always needs to be at least one specified DimensionIndexPointer, but when used to encode segmentations of single frame projection radiography instances for which there is no FoR, the values will be degenerate, i.e., there will be:

only one index in one dimension if there is only one segment, and
one index per segment if there is more than one segment (which is the only reason why there could be more than one frame in a SEG referencing single frame projection radiography instances).

In the first case, DimensionIndexPointer will point to an attribute whose value could theoretically change but has only one value since there is only one frame in the segmentation instance, like SegmentNumber; the DimensionIndexValues attribute for the single segmentation frame will always have a value of 1 since it is defined to start from 1.

In the second case (more than one segment), pointing to SegmentNumber as a dimension would be necessary, and there is no reason to have more than one dimension.
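A minimal sketch of the degenerate case, assuming a single-frame source image without a FoR. Each SEG frame is flattened into one dict for readability (in a real SEG these values live in nested per-frame functional group sequences), and the function name is hypothetical.

```python
def build_no_for_seg_frames(source_sop_instance_uid, segment_numbers):
    """Describe one SEG frame per segment for a single-frame source image.

    The only dimension is SegmentNumber, so DimensionIndexValues holds a
    single value per frame, starting at 1 and incrementing by 1.
    """
    frames = []
    for index, segment_number in enumerate(sorted(segment_numbers), start=1):
        frames.append(
            {
                "SegmentNumber": segment_number,
                "DimensionIndexValues": [index],
                # Frame-specific reference back to the source image; this is
                # what a recipient uses to match SEG frame to source frame.
                "ReferencedSOPInstanceUID": source_sop_instance_uid,
            }
        )
    return frames


# One segment: a single frame whose only index value is 1.
print(build_no_for_seg_frames("1.2.3.4", [1]))
# Three segments: one frame per segment, index values 1, 2, 3.
for frame in build_no_for_seg_frames("1.2.3.4", [1, 2, 3]):
    print(frame["SegmentNumber"], frame["DimensionIndexValues"])
```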

However ...

There are also some multi-frame image objects in DICOM that could be referenced by a SEG but are not "3D" in the formal sense and which do not have an FoR defined, e.g., multi-frame (not volume) ultrasound, nuclear medicine, X-Ray angiography and fluoroscopy.

In these cases, in addition to there being a SegmentNumber dimension in the SEG (whether degenerate or not), there would also need to be some other dimension corresponding to different frames in the referenced image. The actual attributes in the referenced image have no direct correlate in the Segmentation IOD that describes the SEG, such as a FrameIncrementPointer to things like EnergyWindowVector or TimeSliceVector, and there may be multiple such "dimensions" (not called such) in the referenced image. This is awkward to model (there is a bit of an impedance mismatch here between old and enhanced family approaches to describing multi-frame images), and as a SEG creator one can fudge things by creating Stacks with StackID and InStackPositionNumber used as dimension indices in the SEG, or just use FrameNumber (which is not explicitly forbidden, though probably should be, as it defeats the point).

For some sample use cases, see for example the C.8.4.8 NM Multi-frame Module.

Bottom line is that if you want to be able to create SEGs for multi-frame images that do not have an FoR, you need to do one of the following:

have a detailed understanding of the IOD of that multi-frame image and some means to replicate it in the SEG, or
have a "dumb" and generally applicable strategy that encodes a useless but valid additional dimension index beyond SegmentNumber (e.g., by just pointing to FrameNumber, but taking care to make sure the DimensionIndexValue for that dimension starts at 1 and increments by 1 rather than just copying FrameNumber, at least until the standard comes up with a better alternative and/or forbids using FrameNumber like this), or
have a slightly less dumb and generally applicable strategy that creates one or more stacks corresponding to what is in the generic FrameIncrementPointer mechanism without understanding what FrameIncrementPointer actually points to
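The "dumb" FrameNumber strategy in the second option could look roughly like this. It is a sketch only: each SEG frame is represented as a hypothetical (segment number, referenced frame number) pair, and the referenced frame numbers are ranked so that the index values start at 1 and increment by 1 instead of being copied.

```python
def frame_number_index_values(seg_frames):
    """Compute [segment index, frame index] for each SEG frame.

    `seg_frames` is a list of (segment_number, referenced_frame_number)
    pairs. Index values are ranks within each dimension, so they start
    at 1 and increment by 1 even if only some source frames are used.
    """
    segment_numbers = sorted({segment for segment, _ in seg_frames})
    frame_numbers = sorted({frame for _, frame in seg_frames})
    segment_index = {s: i for i, s in enumerate(segment_numbers, start=1)}
    frame_index = {f: i for i, f in enumerate(frame_numbers, start=1)}
    return [[segment_index[s], frame_index[f]] for s, f in seg_frames]


# Segment 1 covers source frames 5 and 9; segment 2 covers frames 9 and 12.
seg_frames = [(1, 5), (1, 9), (2, 9), (2, 12)]
print(frame_number_index_values(seg_frames))
# → [[1, 1], [1, 2], [2, 2], [2, 3]]
```

Note that source frames 5, 9 and 12 map to index values 1, 2 and 3, not to their original frame numbers.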

This is necessary because the "meaningful" dimension descriptive attributes are only present in the referenced image, and not copied into the SEG (e.g., EnergyWindowVector in the NM case).

Be aware that if you are creating an MFDM for a SEG that does have an FoR, you should be wary about reusing any MFDM information in the reference images (vide infra), but you may not have considered that use case yet, there not being too many reference images with an MFDM, with the exception of WSI.

If you are, consider that if you are going to use the same DimensionOrganizationUID and the same DimensionIndexPointer(s), then if only a subset of frames are reused, you cannot start from 1 and increment by 1 within the SEG object - you will need to reuse the DimensionIndexValue(s) of the corresponding frame (tile) in the reference image that has the same DimensionOrganizationUID.

On the other hand you could use the same (nominal) dimensions (same DimensionIndexPointer(s)) but a different (new) DimensionOrganizationUID, and then you could (would be required to) start the DimensionIndexValue(s) from 1 and increment by 1.
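The two options can be contrasted with a small sketch. The data layout is an assumption made for illustration: the reference image's index values are given as a dict from frame number to a list of values, and the function name is hypothetical.

```python
def seg_dimension_index_values(reference_index_values, selected_frames, reuse_organization):
    """Derive SEG index values for a subset of reference frames (tiles).

    reference_index_values: dict mapping reference frame number to its
        list of DimensionIndexValues in the reference image.
    selected_frames: reference frame numbers that have a SEG frame.
    reuse_organization: True if the SEG keeps the reference image's
        DimensionOrganizationUID, False if it declares a new one.
    """
    selected = [list(reference_index_values[f]) for f in selected_frames]
    if reuse_organization or not selected:
        # Same organization: index values must be copied unchanged.
        return selected
    # New organization: renumber each dimension from 1, incrementing by 1.
    remapped = [[] for _ in selected]
    for dim in range(len(selected[0])):
        present = sorted({values[dim] for values in selected})
        rank = {value: i for i, value in enumerate(present, start=1)}
        for out, values in zip(remapped, selected):
            out.append(rank[values[dim]])
    return remapped


# Hypothetical 3 x 3 tiled image: frame number -> [row index, column index].
reference = {n: [(n - 1) // 3 + 1, (n - 1) % 3 + 1] for n in range(1, 10)}
subset = [5, 6, 8, 9]  # only the lower-right tiles are segmented

print(seg_dimension_index_values(reference, subset, reuse_organization=True))
# → [[2, 2], [2, 3], [3, 2], [3, 3]]
print(seg_dimension_index_values(reference, subset, reuse_organization=False))
# → [[1, 1], [1, 2], [2, 1], [2, 2]]
```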

From the perspective of the recipient:

The corollary of all this is the question of what, if anything, a recipient is going to do with the MFDM information in the SEGs that you create.

Whatever the SEG creator has done with dimensions, to establish the mapping between SEG frames and referenced image frames, in the absence of a Frame of Reference (FoR), the recipient needs to follow the frame-specific references in DerivationImageSequence. The Multi-frame Dimension Module (MFDM) has nothing to do with this.

In the "no FoR" case, since the MFDM is mandatory in SEG, assuming it is correctly constructed, you could just "use it" and ignore what it actually points to. I.e., for each frame nothing matters except its one or more DimensionIndexValues.

That said, for matching SEG to reference images, there is no reason to use it, since there are explicit frame level references defining a 1:1 pixel match.
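On the recipient side, following the frame-level references might look like the sketch below. Nested dicts and lists stand in for the SEG's per-frame functional groups; the attribute keywords are the DICOM ones as I understand them, while the function name and UIDs are made up.

```python
def match_seg_frames_to_source(seg):
    """Return (segment number, source SOP Instance UID, source frame number)
    for each SEG frame by following DerivationImageSequence.

    The source frame number is None when the reference carries no
    ReferencedFrameNumber, as expected for a single-frame source image.
    """
    matches = []
    for groups in seg["PerFrameFunctionalGroupsSequence"]:
        source = groups["DerivationImageSequence"][0]["SourceImageSequence"][0]
        segment = groups["SegmentIdentificationSequence"][0]["ReferencedSegmentNumber"]
        matches.append(
            (
                segment,
                source["ReferencedSOPInstanceUID"],
                source.get("ReferencedFrameNumber"),
            )
        )
    return matches


def _frame(segment_number, sop_instance_uid):
    # Builds one hypothetical per-frame functional groups item.
    return {
        "SegmentIdentificationSequence": [{"ReferencedSegmentNumber": segment_number}],
        "DerivationImageSequence": [
            {"SourceImageSequence": [{"ReferencedSOPInstanceUID": sop_instance_uid}]}
        ],
    }


seg = {"PerFrameFunctionalGroupsSequence": [_frame(1, "1.2.3.4"), _frame(2, "1.2.3.4")]}
for match in match_seg_frames_to_source(seg):
    print(match)
```

The MFDM is not consulted at all here, which is the point made above.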

In the case where the FoR is present (and the same):

There may or may not be image-specific or frame-specific references in DerivationImageSequence; if there are, the values in those referenced images/frames for spatial information may not directly correspond to those of the segmentation frames, since they only indicate the frame or image from which the segmentation was derived, not exactly where on that image or frame (i.e., unlike the "no FoR" case, there is not required to be a 1:1 pixel correspondence between SEG and referenced image). See A.51.5.1 Segmentation Functional Groups Description:

"When a Frame of Reference UID is present ... Since this defines the spatial relationship of the segment, the size of the segmentation frames need not be the same size, or resolution, as the image data used to generate the segment data. The Derivation Image Functional Group may also be present, to specify on which images the segmentation was actually performed (since there may be others in the same Frame of Reference that are spatially co-located, but were not used to perform the segmentation)"

Since a set of dimensions is defined with its own DimensionOrganizationUID, which does not necessarily have a 1:1 correspondence with a FrameOfReferenceUID, one can only share what they point to as being in the same nominal space and sampled in the same way (in the most general sense) when both the SEG and the reference images have the same DimensionOrganizationUID. In the legacy CT, MR and PET case, of course, there is no DimensionOrganizationUID in the images, so the issue does not arise. Even if the reference images were enhanced family, they could have a different DimensionOrganizationUID even though the same space was sampled the same way, or they could have been sampled differently, in which case they would have to have a different DimensionOrganizationUID.

Only if the SEG and the reference images shared the same DimensionOrganizationUID, and it was correctly implemented, might there be some utility to using the DimensionIndexValues to match frames.

Whether or not DerivationImageSequence is present, the SEG contains its own spatial information (Plane Position (Patient), etc.) that should be used directly. Assuming too much from the MFDM (or DerivationImageSequence when an FoR is present) may be dangerous, since the choices in there may differ from what is present in the specific spatial attributes of the corresponding images.

Background

The MFDM is present in enhanced family image IODs (including SEGs) to provide a means for recipients to navigate different dimensions without necessarily "understanding" what those dimensions are. E.g., in the 3D case to scroll through a volume without ever knowing that it is a volume, one can blindly follow the opaque DimensionIndexValue, knowing that they define a particular order of traversal, no more and no less.
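As an illustration of following the index values blindly, the sketch below orders frames by their DimensionIndexValues without knowing what the dimensions mean. Treating the first dimension as the most significant is an assumption made for this example, and the function name is hypothetical.

```python
def traversal_order(per_frame_index_values):
    """Return 0-based positions of frames sorted by their index values.

    `per_frame_index_values` holds one list of DimensionIndexValues per
    frame, in stored frame order. Frames are compared dimension by
    dimension, with the first dimension treated as most significant.
    """
    return sorted(
        range(len(per_frame_index_values)),
        key=lambda i: tuple(per_frame_index_values[i]),
    )


# Four frames stored in arbitrary order, two opaque dimensions each.
stored = [[2, 1], [1, 2], [1, 1], [2, 2]]
print(traversal_order(stored))
# → [2, 1, 0, 3]
```

Add 1 to each position to obtain the corresponding DICOM frame number, since frame numbers start at 1.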

If one "understands" what (some or all of) the DimensionIndexPointer attributes "mean" then one can establish (partially or fully) what strategy the SEG creator used to populate them and it may save you having to extract this yourself, but ultimately it is the primary spatial information (Plane Position (Patient), etc.) that is normative, not whatever is described as a "dimension".

E.g., the DimensionIndexPointer may point to StackID and InStackPositionNumber of stacks that replicate what is in Plane Position (Patient) or similar. Or the "stack" concept may define something entirely new, especially when there is more than one stack, such as what subset of space is traversed and in what manner. Even in that case, it is not the navigation information provided in the MFDM that defines how the space was traversed, but the stack information. That stack may be traversed by incremental steps through space in the same direction among parallel slices, or it could be something completely different, like rotation around a line or movement along an arbitrary path. You might well ask why a stack does not have a UID, if it is potentially reusable across instances, which is a valid question and probably deserving of a CP to the standard.

CPBridge commented 2 years ago

This has been resolved in #183

However some of the discussion here may prove very useful in the future

ImagingDataCommons / highdicom

Segmentation not working for 2D X-ray mammography DICOMs #159