dcmjs-org / dcmjs

Javascript implementation of DICOM manipulation
https://dcmjs.netlify.com/
MIT License
291 stars 110 forks source link

fix: 🐛 IDC1346, properly detect binary segmentations declared as fractional #309

Open Punzo opened 1 year ago

Punzo commented 1 year ago

this fix https://github.com/OHIF/Viewers/issues/1346.

https://viewer.imaging.datacommons.cancer.gov/viewer/1.3.6.1.4.1.14519.5.2.1.7695.4164.330974290504454152904316943429 has two segmentations: 1) Screenshot from 2022-09-14 13-59-19

this one was actually a binary seg (but the DICOM multiframe.SegmentationType is set to fractional). The seg array values are [0,1] (note: the multiframe.MaximumFractionalValue is 255).

There was already a check for this case (binary seg, but defiined as fractional), but the check was checking if the array values were either [0,multiframe.MaximumFractionalValue]. Now I changed in a way that it will check if the values are [0, firstNonZeroValue]. In this case the seg will be loaded as a normal binary.

2) Screenshot from 2022-09-14 13-59-37

here the seg has various values in the array (something like: [0, 12, 33, 48]). For this case, I added a flag (default to true) for converting the seg into a binary one. The conversion is done with a simple thresholding where I use the maximum value found in the array (multiframe.MaximumFractionalValue does not look be reliable number, at least for IDC datasets) and I multiply it for a thesholdingParameter (80% in default).

@pieper can you please review? thanks!

Punzo commented 1 year ago

@pieper some tests are failing, but they not related to this PR, and they fail on master branch too. It looks like first detect was at commit https://github.com/dcmjs-org/dcmjs/commit/42a7ca7d007ef12a0f287d9b23771c1516fab6ab

pieper commented 1 year ago

@Punzo thanks for digging into this 👍

Do we know what the segmentations are supposed to look like for this dataset? From what I see the first image might be correct but the second looks odd. My guess is that the submitters to TCIA had labelmaps and decided it was easier to put them in a single segment and call it fractional rather than going through the process to make bitplanes.

The images are described here but it's hard to tell how that would map to what we see here. They say "Derived objects from the DWI and DCE acquisitions are included for the convenience of the user. Some of these are not strictly DICOM compliant and may not be readable by all DICOM software packages." So I'm not sure what we should expect.

Regarding the tests, yes, there's a PR to fix that: https://github.com/dcmjs-org/dcmjs/pull/308 following conversation here: https://github.com/dcmjs-org/dcmjs/pull/303

fedorov commented 1 year ago

Do we know what the segmentations are supposed to look like for this dataset?

I am afraid we do not!

Very unfortunately, David Newitt at UCSF, who did this conversion using in-house matlab code (I believe), retired, and there isn't anyone to ask for clarifications about those segmentations.

These are the collections where FRACTIONAL is encountered: ispy2, qiba_ct_1c, breast_mri_nact_pilot, acrin_6698, ispy1. All except the QIBA one come from David Newitt / UCSF, from what I know :-(

I can generate sample URLs for image+segmentation pairs from each of the collections mentioned, if this helps. But this becomes complete guesswork... Maybe indeed we should shelve support of those collections until we have FRACTIONAL support implemented.

Punzo commented 1 year ago

@Punzo thanks for digging into this +1

Do we know what the segmentations are supposed to look like for this dataset? From what I see the first image might be correct but the second looks odd. My guess is that the submitters to TCIA had labelmaps and decided it was easier to put them in a single segment and call it fractional rather than going through the process to make bitplanes.

The images are described here but it's hard to tell how that would map to what we see here. They say "Derived objects from the DWI and DCE acquisitions are included for the convenience of the user. Some of these are not strictly DICOM compliant and may not be readable by all DICOM software packages." So I'm not sure what we should expect.

not sure, but the second segmentation is defined as a mask, so maybe is the mask used in the seg and why is so ackward. Please let me know if I should take any other action.

Regarding the tests, yes, there's a PR to fix that: #308 following conversation here: #303

Nice!

Punzo commented 1 year ago

Maybe indeed we should shelve support of those collections until we have FRACTIONAL support implemented.

Up to you, but even with proper support, the shape of the segmentation will not change respect the one in my second screenshot (only the edges will be "smoothed" with proper support). In the conversion to binary, I just have essentially taken the pixels with highest value (48) and put their value to 1 (and 0 to all others pixels). So the shape of seg would not change. See https://github.com/dcmjs-org/dcmjs/pull/309/files#diff-077798721d0df020ee8184a2f54d4459dba916e19c86ac0f0aecc4e6ea80bf2fR1274-R1303

pieper commented 1 year ago

Maybe indeed we should shelve support of those collections

Given this discussion that's what I would vote for.

Punzo commented 1 year ago

Maybe indeed we should shelve support of those collections.

Given this discussion that's what I would vote for.

Ok then for the second segmentation, i will just remove the code that does the thresholding and I will reput the warning that fractional segs are not supported. The first seg will be loaded as a binary.

Punzo commented 1 year ago

Maybe indeed we should shelve support of those collections.

Given this discussion that's what I would vote for.

Ok then for the second segmentation, i will just remove the code that does the thresholding and I will reput the warning that fractional segs are not supported. The first seg will be loaded as a binary.

Done in https://github.com/dcmjs-org/dcmjs/pull/309/commits/4736ff15c326310b73bb5741df58ead959955561 @pieper @fedorov. Please let me know if I need to address anything else. As soon as this will be merged, I will update OHIF.

Punzo commented 1 year ago

Maybe indeed we should shelve support of those collections

Given this discussion that's what I would vote for.

P.S.: if you meant that we should just not support the full collection, I will leave to you to close this PR.

pieper commented 1 year ago

Good question @Punzo . @fedorov do you see any value in supporting the special case of a fractional seg that only has one non-zero value? I tend to think it is better to be consistent and avoid displaying something that is not cleanly encoded.

Punzo commented 1 year ago

@fedorov should we close this?

fedorov commented 1 year ago

I feel bad about discarding all the work you have done, but I agree with Steve - implementing custom rules based on assumptions can be dangerous. Let me also discuss this with @dclunie before we finalize the decision and close this PR.

fedorov commented 1 year ago

For the sake of completeness, this dataset is from the collection documented here: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=50135447#501354470d1c0b53772a4fba8580e8b19f65069b

Specifically, there is Analysis mask files description. However, conventions described in this document do not seem to be followed in the sample investigated here.

image

dclunie commented 1 year ago

I looked at one of these, and even though it does not appear to follow the exact description mentioned, it does look suspiciously like a bit mask with various combinations bits from 0x00 through 0x20 set:

% dchist -h ACRIN-6698-102212/04-04-2002-102212T0-ACRIN-6698ISPY2MRIT0-82630/61900.000000-ISPY2\ VOLSER\ uni-lateral\ cropped\ Analysis\ Mask-08921/1-1.dcm

[0] 42423 p=0.00809155 e=0.0562311 cum=0.0562311 [0x1] 66288 p=0.0126434 e=0.0797228 cum=0.135954 [0x2] 453 p=8.64029e-05 e=0.00116631 cum=0.13712 [0x11] 117012 p=0.0223183 e=0.12243 cum=0.25955 [0x20] 21569 p=0.00411396 e=0.0326042 cum=0.292154 [0x21] 345053 p=0.0658136 e=0.258349 cum=0.550504 [0x22] 3493 p=0.000666237 e=0.00702992 cum=0.557534 [0x31] 4646589 p=0.886267 e=0.154377 cum=0.71191

This is obviously not a valid use of a DICOM SEG object if that is indeed the case (nor is what is described in the document).

But potentially something that could be split into separate binary segmentations if someone wanted to bother.

I don't think we (IDC) should expect any viewer to support such an invalid object though, regardless.

dclunie commented 1 year ago

Found a private data element within the SEG object that describes the mask:

(0x0117,0x10d3) LO Mask Legend VR= VL=<0x0088> <00000001 ( 1) : PE threshold\00000010 ( 2) : other SER filter\00010000 (16) : automatic background threshold\00100000 (32) : manual VOI >

pieper commented 1 year ago

Thank you for investigating @dclunie

I don't think we (IDC) should expect any viewer to support such an invalid object though, regardless.

Agreed, let's close this as an issue but perhaps add a link to this discussion someplace. Do we have something like release notes or a "known issues" document to collect links to discussions like this?

dclunie commented 1 year ago

Perhaps we should put in a TCIA HelpDesk request to correct the invalid SEG objects? In that, we could document the problem and suggest a solution.