ImagingDataCommons / highdicom

High-level DICOM abstractions for the Python programming language
https://highdicom.readthedocs.io
MIT License
178 stars 37 forks source link

Provide meaningful error message if all Segmentation frames are empty #180

Closed adamvonpaternos closed 2 years ago

adamvonpaternos commented 2 years ago

When using the static method of _omit_empty_frames, if there are no empty frames present (frames without positive pixels), then numpy thows an exception with regard to the empty non_empty_frames list passed into np.stack.

https://github.com/herrmannlab/highdicom/blob/95f39dd722ae6d357af3f942e2130d0ff4d68bfc/src/highdicom/seg/sop.py#L1037-L1078

May I suggest just a check on the array before proceeding?

        if len(non_empty_frames) == 0:
            return (pixel_array, plane_positions, []) 
CPBridge commented 2 years ago

Thanks for filing the bug report, @adamvonpaternos !

Though this seems like a simple bug, it actually gets at a more fundamental point...

If all the frames are empty, and the "omit_empty_frames" option is used, what exactly should the desired behaviour be?

I would argue that we should probably throw a more useful error message and leave it up to the user to check that the segmentation is non-empty before trying to construct a segmentation. I think creating a segmentation with no actual pixel data would be incorrect.

Thoughts @hackermd ?

CPBridge commented 2 years ago

Actually @adamvonpaternos can you please confirm whether you are encountering this issue when there are no empty frames or all empty frames? Your text "if there are no empty frames present" suggests that there are "no empty frames" present, but your solution is a fix for "no non-empty frames" (i.e. all frames are empty).

hackermd commented 2 years ago

I would argue that we should probably throw a more useful error message and leave it up to the user to check that the segmentation is non-empty before trying to construct a segmentation. I think creating a segmentation with no actual pixel data would be incorrect.

I fully agree.

hackermd commented 2 years ago

Actually @adamvonpaternos can you please confirm whether you are encountering this issue when there are no empty frames or all empty frames? Your text "if there are no empty frames present" suggests that there are "no empty frames" present, but your solution is a fix for "no non-empty frames" (i.e. all frames are empty).

That's the conclusion I reached as well.

pieper commented 2 years ago

I think creating a segmentation with no actual pixel data would be incorrect.

I fully agree.

I would think creating an empty segmentation is perfectly valid, meaning that none of the segmented structure was in the image. Like passing an empty list is different than no list at all.

hackermd commented 2 years ago

Good point @pieper. The question arises whether such a segmentation can be encoded sparsely since this would result in a Pixel Data element with an empty value. The value of Number of Frames would have to be zero.

fedorov commented 2 years ago

I think we had this discussion before with participation of @dclunie, and I understand it is not the correct DICOM practice to indicate findings that are not present by communicating empty segments. And I agree with that approach. For example, we probably would not want to expect to see an empty MR series to indicate that certain sequence was not acquired. But I admit this question came up in the context of dcmqi before. Unfortunately, I cannot find those conversations.

CPBridge commented 2 years ago

To be clear, the current behaviour of highdicom when it receives a segmentation pixel array that is all zeros (avoiding the potentially confusing term "empty") is as follows:

I propose to change this to

If people feel strongly, we could also add a check to disallow the second case also and never allow the user to pass a segmentation pixel array that contains only zeros

CPBridge commented 2 years ago

(also, before we decide on an approach, I would like to get confirmation from @adamvonpaternos that we haven't miunderstood what's leading to the problem he's having. It's possible that he's actually uncovered a different bug)

adamvonpaternos commented 2 years ago

Apologies for the delayed response. @CPBridge This is occurring in my use case when the pixel array contains entirely False values, i.e. all of the contained frames do not contain positive pixels. Thank you for the activity on this!

fedorov commented 2 years ago

If people feel strongly, we could also add a check to disallow the second case also and never allow the user to pass a segmentation pixel array that contains only zeros

By design, highdicom is opinionated, so I think this would be completely fine. If someone really wanted to work around this, they would be able to with effort. It is ultimately up to you @CPBridge and @hackermd as gatekeepers of highdicom opinions what you want to do! But it does set a precedent - the decision of communicating or not entities/codes/etc that indicate absence of something will have repercussions in other object types. I encourage you at least to wait for @dclunie to come back from vacation and communicate his perspective on this topic, before you make the decision.

CPBridge commented 2 years ago

My personal leaning is to continue to allow segmentations with no "positive" pixels. The reason being that otherwise it becomes difficult to differentiate between a segmentation that was run and produced no positive output, and simply the lack of a segmentation. If we think in the context of an automated process (my bias of course), I would not want a segmentation process that crashed and therefore produced no output segmentation to be interpreted as the absence of a finding. This becomes especially important when segmentation processes are effectively serving as detectors too, most commonly for abnormalities such as lesions, tumors, strokes, stones, etc. The only way around this I think is to always accompany by an SR, and while I think that is probably a good idea, I imagine many people will not do this.

Anyway, we can leave this issue open to discuss with @dclunie but in the meantime I will create a PR later to improve the obviously bad error message users are currently getting "empty" segmentations.

seandoyle commented 2 years ago

@CPBridge - I agree with continuing to allow segmentations with no 'positive' pixels. As we create pipelines of models that can pass inference results this makes sense. But I do worry about the ambiguity of 'no findings' vs 'internal error'. I wonder if there should be a special code or attribute that indicates 'no findings' for the automated scenario because it's going to be ambiguous and/or open to abuse otherwise.

hackermd commented 2 years ago

Agreed! We need to allow segments with all values being zero.

Whether an "empty" segmentation could be encoded in a sparse fashion is something that probably would require a change of the standard and may cause problems for several libraries and applications.

Part 3 Section C.7.6.6 Multi-frame Module states:

Number of Frames (0028,0008) shall have a value greater than zero

I generally think that segmentation should be encoded as FRACTIONAL, which permits the use of image compression codecs. If all pixels are zero, frames will probably be compressed very efficiently.

hackermd commented 2 years ago

@CPBridge I would suggest not raising an error when omit_empty_frames is True and no non-empty frames exist, but instead log a warning message and encode all frames. If one performs segmentation, one may not know upfront whether the operation will result in any pixel will be assigned a value greater than one and one would thus have to always set omit_empty_frames to False to be on the save side.

fedorov commented 2 years ago

Saving empty segmentations is definitely very convenient from the producer perspective, and makes sense for the consumer when you know exactly what was the experiment and you want a uniform layout of data for the specific task.

But real life is not a well-defined scientific study. In the general case, archive will contain data from different experiments and use cases, and the intent of the standard is to facilitate interaction with such heterogeneous datasets.

Here are some specific examples where I had strong thoughts against empty segments.

I open a study in the viewer, it shows a SEG series. I load that series, but I don't see anything. I spend time really scroll carefully, and I still don't see anything. I start wondering - is this the viewer bug? Is this the tool that wrote those segmentations that messed up? Can I figure out who did it and debug it? In that specific case it took me and OHIF folks some hours to convince ourselves that it is probably really empty. And hopefully it was intended to be empty. But you can't de-cypher the intent from those bits.

Another example. I query my archive for all images that have segmentation of specific type. And now I no longer can trust the metadata the way I expect. I actually have to fetch the entire objects, or implement some other non-standard logic to figure out whether SEG that has a segment called "tumor" actually has segmentation of a tumor or not.

I would never encourage or recommend creation of empty segments.

hackermd commented 2 years ago

@fedorov Your points are well taken and I can see how an "empty" segment may cause confusion in certain situations. However, the same may be true for very small objects, which may also not be readily visible.

I agree with @pieper that the absence of a segment is semantically different from an empty segment. Consider image segmentation for identification of tumor lesions. There may just not be any tumor lesion in the image and the result of the image segmentation operation would be a Boolean array/tensor with all pixels being zero.

fedorov commented 2 years ago

There may just not be any tumor lesion in the image and the result of the image segmentation operation would be a Boolean array/tensor with all pixels being zero.

I understand. But I see more harm than benefit in creating a blank object to indicate that.

Effort to troubleshoot this by the user for the first use case - visualization - will be very significant. In fact, for a non-DICOM-savvy user, there will be no recourse to understand if the problem is in the viewer, in the data, or establish blank segmentation was intended.

For the second use case - searching - there is no standard solution, as far as I can see, to inform metadata search and differentiate blank from non-blank segmentations.

I understand blank segmentation is semantically different from absence of the segment - no argument. But how will you communicate what that semantics is ("that semantics" = "someone tried to segment the thing that this segment is labeled with, but did not find anything like it") in the metadata in a standard manner?

pieper commented 2 years ago

@fedorov it sounds like you are concerned about current OHIF not having features to make it clear that segmentations are empty, and that's a fair issue. But I agree with the sentiment that @CPBridge summarized as the ability to "differentiate between a segmentation that was run and produced no positive output, and simply the lack of a segmentation".

To me, for example, running a tumor segmentation and getting an empty result would be a very positive thing. Running a tumor segmentation and getting no result would be worrisome. The first would have metadata about the source data, the version of the segmentation algorithm, the date it was run, etc. and the result would be a clean bill of health. The second would just be a void of information. I agree we can have other metadata channels to further clarify the semantics, but even so having a blank segmentation result seems like a valid and useful construct.

hackermd commented 2 years ago

Effort to troubleshoot this by the user for the first use case - visualization - will be very significant. In fact, for a non-DICOM-savvy user, there will be no recourse to understand if the problem is in the viewer, in the data, or establish blank segmentation was intended.

This may also be the case in other situations. For example, if there is just one positive pixel or very few, the user may have a hard time seeing that in a viewer, too. Also, if the segmentation is fractional, the values may all be very close to zero, which may also be hard to see (depending on the palette color lookup table).

For the second use case - searching - there is no standard solution, as far as I can see, to inform metadata search and differentiate blank from non-blank segmentations.

Well, we also don't have a solution to determine the size or shape of the segment without analyzing the pixel data and taking measurements. All pixels being zero is just an extreme case.

fedorov commented 2 years ago

To me, for example, running a tumor segmentation and getting an empty result would be a very positive thing. Running a tumor segmentation and getting no result would be worrisome.

Agreed - it must be indicated, but I do not think blank segmentation is the mechanism that should be used to indicate that running an algorithm returned negative result.

There are numerous representations and types of decisions that could be produced by an algorithm. Should we enumerate blanks for all options? What if the algorithm produces RTSTRUCT or point annotation?

There are numerous reasons for the algorithm to produce a blank result. What if the tool failed? What if the input did not meet expectations?

I hope you accept that the semantics of this blank segmentation is undefined.

I believe SR is the right mechanism to precisely communicate the semantics of the result, as alluded to by @CPBridge earlier (how exactly is a separate conversation). I completely share the sentiment that most users would not care less about SR. But this is the case for many other capabilities of the standard. The fact that those containers are unfamiliar, not implemented, confusing, limiting, etc does not prevent us from introducing the right way to encode information. Encouraging users to create blank segments is not the right way.

I remain unconvinced, and this is not an opinion in highdicom that I support. But we all know that we agreed to disagree - and you guys do the heavy lifting with the implementation, so it is your choice. At the same time, if highdicom is successful, I will have to deal with the consequences in IDC. Which is why in my interactions with the users I will keep discouraging from using this approach. I will definitely discuss this with David when he is back, and am ready to revise my perspective based on his comments.

hackermd commented 2 years ago

@fedorov I disagree. If an algorithm failed, then there should be no Segmentation instance. If the algorithm performed segmentation, then any segment that was evaluated should be included in the Segmentation instance, independent of whether any pixel is positive or not. The absence of a segment is semantically very different from an "emtpy" or "blank" segment.

hackermd commented 2 years ago

@fedorov If I understand correctly, then it is the challenge to distinguish between different instances of the same segment type that you are concerned about. I agree with you that the current Segmentation IOD is not ideal for instance segmentation and we should either improve the existing Segmentation IOD (see #184) or define a new Instance Segmentation IOD. cc @dclunie

CPBridge commented 2 years ago

Resolved with #181