[Bug] Regression handling RGB in XC modality

fedorov commented 2 months ago

Describe the Bug

This sample is no longer rendered correctly: https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.5962.1.2.0.1677425356.1733

For comparison, see how it used to work here: https://github.com/OHIF/Viewers/issues/3354#issuecomment-1673389784:

To download the files in that study (note - it is ~233GB!), assuming you have python and pip on your system, do the following:

pip install --upgrade idc-index
idc download 1.3.6.1.4.1.5962.1.2.0.1677425356.1733

Steps to Reproduce

Load the sample study referenced in the above into the viewer.

The current behavior

First series shows grayscale with multiple slices arranged in a mosaic. Second does not load at all.

The expected behavior

Both should show up in color as used to work in the past given the screenshot in https://github.com/OHIF/Viewers/issues/3354#issuecomment-1673389784

OS

macOS

Node version

n/a

Browser

Chrome

sedghi commented 2 months ago

For instance, if I want to debug this specific study without downloading it, how can I do that in my local OHIF since you said the DICOMweb server is public?

CC @igoroctaviano

fedorov commented 2 months ago

@sedghi I am not an OHIF developer, so I cannot answer how to do it in OHIF configuration (@igoroctaviano can), but IDC public DICOMweb endpoint is this: https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb (documented in https://learn.canceridc.dev/portal/proxy-policy). Note that the "no downloads" wording in the URL is legacy, and we do not restrict that DICOMweb endpoint for downloads - as discussed in the link above.

fedorov commented 2 months ago

DICOM PlanarConfiguration will indicate whether color image is ordered as RGBRGB or RRGGBB. Need to investigate.

fedorov commented 1 month ago

@dclunie I checked, and for Visible Human, we have both options for PlanarConfiguration. Neither one works.

SELECT distinct(PatientID), StudyInstanceUID, SeriesInstanceUID, PlanarConfiguration 
FROM `bigquery-public-data.idc_current.dicom_all` 
where Modality = "XC"

PatientID   StudyInstanceUID    SeriesInstanceUID   PlanarConfiguration
VHP-M   1.3.6.1.4.1.5962.1.2.0.1672334394.26545 1.3.6.1.4.1.5962.1.3.0.2.1672334394.26545   0
VHP-F   1.3.6.1.4.1.5962.1.2.0.1677425356.1733  1.3.6.1.4.1.5962.1.3.0.1.1677425356.1733    1
VHP-M   1.3.6.1.4.1.5962.1.2.0.1672334394.26545 1.3.6.1.4.1.5962.1.3.0.1.1672334394.26545   1
VHP-F   1.3.6.1.4.1.5962.1.2.0.1677425356.1733  1.3.6.1.4.1.5962.1.3.0.2.1677425356.1733    0

At the same time, I checked the following 2 US series, with different options

1 | LiverUS-04 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809 | 0 |  
2 | C3N-02436 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567 |

and both work fine:

https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175&SeriesInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809

https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564&SeriesInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567

So it must be something else, or combination with something else...

dclunie commented 1 month ago

Must be something else then. I get the impression that color image handling in OHIF is extremely fragile (whether compressed or uncompressed).

pedrokohler commented 1 month ago

@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?

igoroctaviano commented 1 month ago

@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?

@pedrokohler, this is typically related to a decoding issue. I'd suggest rolling back to commits before any recent RGB/decoding changes, testing them one by one.

@sedghi, it might be a good idea to open an issue to add snapshot testing in CS3D to help prevent these kinds of regressions.

Here’s a similar issue we encountered previously: https://github.com/cornerstonejs/cornerstone3D/pull/829

sedghi commented 1 month ago

There's a widespread problem with DICOM color images. Here's the core issue:

When dicom servers sends DICOM color images, they can convert the pixel data from one color space (like YBR422) to another (like RGB) as we know, however, these servers frequently don't update the metadata to reflect this conversion. So you end up with a mismatch - the metadata might say "YBR422" even though the actual pixel data has been converted to RGB.
Apparently the DICOM standard suggests that the Photometric Interpretation field should reflect the format of the stored data (am I right @dclunie ?). But this raises a question - what should happen when the data gets converted?
Due to this widespread inconsistency, we can't reliably ONLY trust the Photometric Interpretation value in the metadata, we should look at the pixelData too. To work around this, we developed a solution that tries to detect the actual pixel data format by analyzing the values themselves (implemented here). While this approach improved our ultrasound image support and made things more consistent, we're still encountering some issues with it as you see in this issue

Take a look at these discussions for instance

https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533/3

I need help fixing the color settings permanently.

fedorov commented 1 month ago

@sedghi if you can provide a reproducible example demonstrating DICOM non-compliance of Google Healthcare API DICOMweb implementation, we can escalate through our collaborators at Google to try to get it fixed. I do not know enough about this issue to reproduce it myself.

sedghi commented 1 month ago

Can @dclunie clarify the DICOM standard's requirements for Photometric Interpretation? Should we rely on that? since we are seeing inconsistencies with that tag and what actually the server responds in pixelData

pedrokohler commented 1 month ago

@sedghi @fedorov after lots of testing, I found that this PR was the responsible for making it fail: https://github.com/cornerstonejs/cornerstone3D/pull/829/files

I'll investigate further to see what exactly is happening there

fedorov commented 1 month ago

@pedrokohler thank you, but let's wait to hear from @dclunie about the question asked by Alireza first!

dclunie commented 1 month ago

Very long ... sorry ...

Here are few points to consider wrt. what the standard says about this and how it relates to @sedghi's description of the recent changes, and what might be needed to make this more robust going forwards:

when an image is encoded in an uncompressed transfer syntax, all of the DICOM attributes in the Image Pixel Module need to accurately describe the binary contents of the uncompressed PixelData attribute, including PhotometricInterpretation - e.g., if the pixels are RGB, then that's what it needs to say, if they really are YBR (e.g., were encoded that way be the ultrasound machine) then again, that's what PhotometricInterpretation needs to say (usually YBR_FULL); see "Native" format in PS3.5 and Photometric Interpretation in PS3.3.
if an image is decompressed, whether from a lossless or lossy transfer syntax, and re-encoded in DICOM, obviously the decompressed file is required to have the correct description of the decompressed PixelData in the Image Pixel Module attributes - the classic example being when a JPEG image in YBR_FULL_422 is color component converted into RGB during decompression; what the standard requires in this respect is obvious; problems arise when the encoder of the decompressed image fails to set Photometric Interpretation correctly - viewers may try to apply heuristics to work around this (e.g., it's YBR_FULL_422 but decompressed so its probably wrong so I will assume RGB) but like any heuristics, they may fail, as we are apparently seeing here
if an image is compressed, whether losslessly or lossy, then the image pixel data characteristics (including the meaning of the color components +/- downsampling of the chrominance channels) may or may not be expressed in metadata within the compressed bit stream, but if it is, that is always to be used "to control the decoding"; that said, the DICOM Image Pixel Module attributes are still required to be "consistent with the characteristics of the compressed data stream" (even though in practice they may sometimes not be); for lossy baseline JPEG, the compressed bitstream might not describe the color components, so DICOM Photometric Interpretation of YBR_FULL_422 is really important to signal this, and RGB for compressed data is intended to indicated that the color components have NOT been transformed (as is the case with Leica/Aperio WSI, for example); the presence of a JFIF header is not required in the compressed bitstream, but if present, may signal YCbCr components rather than untransformed RGB components

From an application implementation perspective this can be even more challenging depending on what the underlying toolkits and libraries and codecs do and how much control there is over what they do (e.g., a lossy baseline JPEG decoder might always attempt YCbCr to RGB conversion even if they really were RGB, or always do it if a JFIF header is present, or check for the Adobe marker segment that signals RGB, or whatever).

To restate the obvious, the only way to mitigate against unexpected code changes causing failures is to test - have a good range of test inputs representing as many known good (and bad) variants as possible, and a decoded result to expect, and automate the comparison in regression tests. For properly encoded DICOM images (uncompressed, compressed and decompressed/re-encoded) there should be no question about the correct result. For "bad" images (e.g., RGB Photometric Interpretation for a YCbCr encoded lossy JPEG bitstream signaled by a JFIF header, or a decompressed JPEG that still has YBR_FULL_422 for the Photometric Interpretation) some heuristics may be implemented to work around this and need to be tested too but the heuristic should never prevent the correct handling of a correct image.

In the case of the Cornerstone PR #829 that was mentioned as responsible for the problem, there is a new function isColorConversionRequired() that is supposed to work around an Orthanc server bug "[Orthanc convert YBR to RGB but does not change metadata](https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533)", and which @sedghi also referred to. Unfortunately it seems that insufficient testing was done to assure that this workaround did not undermine handling of "correct" images in which the codec had handled the color component transformation (and chrominance channel upsampling; repeating it would screw things up, if that is what is actually happening.

The heuristic in isColorConversionRequired() seems to be trying to detect uncompressed chrominance downsampling based on the downsampling description in Photometric Interpretation matching the decompressed pixel data length, which is a whole other thing (and as @sedghi described, is often related to funky ultrasound images; some uncompressed or decompressed RLE ultrasound images really do have uncompressed color component downsampling). I don't actually see how the function (and its consequences if it returns true) actually specifically target the Orthanc defect anyway, other than by assuming that if it says YBR_FULL_422 but the decompressed pixel data length is not downsampled (i.e., it just equals columns rows samplesperpixel * bytespersample; I think the imageFrame must already exclude the padding byte to even judging by other code therein).

However, since our problem images should be Photometric Interpretation RGB and uncompressed, I am not sure how this PR is causing our images to fail, since isColorConversionRequired() should return false by my reading of it.

I assume there is no issue with the DICOMweb side of things, in that the image requested was returned uncompressed. If it had been requested with an Accept header of image/jpeg rather than application/octet-stream, and the server has lossy JPEG compressed them to satisfy the frame request (perhaps even retrieve rendered), and of course while doing that would very likely color space transform them, with no explicit means of signaling that (except perhaps a JFIF header included in the compressed bitstream). See also the warning about this sort of thing in the description in PS3.18. I.e., the metadata response and the bulk pixel data response would be inconsistent in such cases wrt. the characteristics of the pixel data. This is purely speculative on my part since I have now idea how OHIF is retrieving these images, or under what circumstances the routines in the PR get invoked.

I also see a bunch of stuff in the PR about Palette Color Lookup Tables, but that should not be relevant to us since they won't be in the problem image.

fedorov commented 1 month ago

@pedrokohler you should also use samples mentioned in https://github.com/OHIF/Viewers/issues/2934! Also samples from DICOM FTP server: ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG12/.

pedrokohler commented 1 month ago

@fedorov @dclunie the problem is the isColorConversionRequired function of the PR I sent above.

Once I force it to return true for the dataset above, it renders the image correctly.

@dclunie what else could be checked inside this function so that we would be able to tell that this image requires conversion? For instance, I could check to see if the modality is XC and always return true in this case - note that I don't know if this makes any sense, I'm just giving an example.

dclunie commented 1 month ago

Once I force it to return true for the dataset above, it renders the image correctly.

Very interesting observation.

It makes little sense though, since this really is an RGB image, so by definition should not require "color conversion".

I can only assume that whatever action is being (needs to be) triggered by isColorConversionRequired() returning true is doing more than "color conversion" - perhaps related to changing PlanarConfiguration of 1 (color-by-plane) to 0 (color-by-pixel)?

I could check to see if the modality is XC and always return true in this case

Please do not do that - you need to understand the root cause of the problem with what is perfectly good input data and fix it. There is nothing special about the modality or any other attribute that is not directly related to the description of the encoding of the pixel data (i.e., Image Pixel Module attributes and Transfer Syntax).

I think you need to figure out what "color conversion" entails and whether or not it can safely be triggered by PlanarConfiguration == 1, or if it needs to be refactored (to separate "color component conversion" (e.g, from YBR to RGB, +/- chrominance channel upsampling) from "plane organization" (color-by-plane to color-by-pixel).

Who wrote the "color conversion" code? You could talk to them about their design for this.

sedghi commented 1 month ago

@dclunie Thank you David for the excellent DICOM color crash course. It will help us improve our color handling stability.

pedrokohler commented 1 month ago

@fedorov It looks like this was really a cornerstone3D issue but it was already fixed in this PR https://github.com/cornerstonejs/cornerstone3D/pull/1457/files

I'm waiting for Bill Longabaugh to confirm that ViewersV3 does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.

If this is indeed the case, then we just need to deploy the master branch and the issue should be fixed.

fedorov commented 1 month ago

I'm waiting for Bill Longabaugh to confirm that idc-slim does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.

You mean ViewersV3, not idc-slim, right?

pedrokohler commented 1 month ago

You mean ViewersV3, not idc-slim, right?

My bad. Yes, I mean ViewersV3.

fedorov commented 3 weeks ago

Replicating from Slack and to keep @dclunie in the loop. @pedrokohler can you please take a look?

Per @wlongabaugh:

OK, in the past, it appears the > 32 MB slices got sent as roughly ~20 MB slices, and this appeared to be due to a change in the Accept: transfer-syntax=* request header: https://github.com/ImagingDataCommons/IDC-ProjectManagement/issues/1574#issuecomment-1924983645

1:13 OK, I admit I was wrong, and yes it used to work, by getting the > 32 MB slices to work by getting them to be ~20 MB on the wire.

I note the current V3 viewer is sending this:

accept:
multipart/related; type=application/octet-stream; transfer-syntax=*; transfer-syntax=*

1:12 Is the second transfer-syntax=* (and w/o a ";") messing things up?

sedghi commented 3 weeks ago

Checkout extensions/cornerstone/src/initWADOImageLoader.js that is the location that decide on the transfer syntax

and

platform/core/src/utils/generateAcceptHeader.ts

pedrokohler commented 3 weeks ago

@fedorov I removed the duplicate and just tested it with transfer-syntax=* and with transfer-syntax=*;, meaning both with and without semicolon. Still same error occurs in both cases.

pedrokohler commented 3 weeks ago

@fedorov either way this error has nothing to do with this ticket. According to @wlongabaugh on Slack this same error is occurring in production right now.

fedorov commented 3 weeks ago

Thank you for checking. Let's wait to see what @wlongabaugh says about this.

sedghi commented 3 weeks ago

Sometimes the type should be in quotes. We used to have an option to omit it, as including it can trigger an options request flow on some servers, slowing things down. However, sometimes it requires the type to be in quotes, like multipart/related; type="application/octet-stream"; transfer-syntax=*.

fedorov commented 3 weeks ago

@wlongabaugh @pedrokohler note that while accessing IDC DICOM store directly - bypassing the proxy - I am able to visualize both series.

See demo here: https://app.screencast.com/QEePMcwuQOuWb.

Also note the type is indeed in quotes:

No server error

fedorov commented 3 weeks ago

For this experiment, I utilized the v3 instance deployed by @igoroctaviano via https://github.com/ImagingDataCommons/idc-viewers-sandbox-gha-testing. I do not know where that version is relative to various IDC tiers.

fedorov commented 3 weeks ago

Could it be that the proxy started dropping the quotes @wlongabaugh?

wlongabaugh commented 3 weeks ago

What size are the slices that are being sent directly from Google? Are they 32+ MB, or ~20 MB? If Google has taken to ignoring the fact that we are allowing a gzip transfer encoding to be sent, that would be the problem. @fedorov

To clarify, I have no doubt that a direct call to Google would work. The question is whether the direct call returns a compressed slice or not. If it is uncompressed, that yes the Viewer will read it, but it would not survive a pass through the proxy since it is no longer compressed.

wlongabaugh commented 3 weeks ago

@fedorov The code that forwards the headers from the client to the Google backend has not changed.

wlongabaugh commented 3 weeks ago

The proxy code that passes the headers through to the backend is

headers={key: value for (key, value) in request.headers if key != 'Host'},

so however the Python requests package treats header quotes is what is happening.

wlongabaugh commented 3 weeks ago

It is true that the Python version has been upgraded, as the old version was no longer supported. Seems unlikely that the requests behavior would change for handling headers, but that is a difference.

wlongabaugh commented 3 weeks ago

But I see that the 32M+ slices series no longer loads in production either, where the headers are what they have always been:

accept: multipart/related; type=application/octet-stream; transfer-syntax=*
accept-encoding: gzip, deflate, br, zstd

the Python version is still 3.8, and the proxy code has not changed in many months. That is still running against v1 (not v1beta1) which implies Google changed some behavior since we deployed in the summer of 2023.

fedorov commented 3 weeks ago

the headers are what they have always been

I have no means to ascertain what they have always been. I do observe that when I use OHIF sandbox instance against IDC DICOM store directly, the parameters are in quotes. I do not know if this has any effect on the issue we are discussing.

wlongabaugh commented 3 weeks ago

@fedorov The important question remains. What size are the slices coming back from the store when hit directly: 32M+ or ~20MB?

pedrokohler commented 3 weeks ago

@fedorov @dclunie @sedghi I was able to make it work with the current version just by forcing a transfer-syntax=1.2.840.10008.1.2.1 instead of transfer-syntax=*

@dclunie @fedorov do we want to use this transfer-syntax in other cases as well or just this one?

@sedghi what's the best way to make OHIF use this transfer-syntax in this specific case or any other cases that Clunie and Fedorov specify?

fedorov commented 3 weeks ago

@pedrokohler it is more important to understand the source of the regression than to find a workaround to just make it work. Can you please try all the items we discussed yesterday, and report on the findings?

Try to not include transfer-syntax=*, try not to have quotes, try to request single part
Look at the changes in the code around header
Go back to the version of ViewersV3 from February and test against the proxy and against DICOM Google server directly

sedghi commented 3 weeks ago

Hmm, we don't handle transfer syntax on a case-by-case basis. Honestly, I don't think there's a good way to do this. Didn't Pedro try the latest Cornerstone3D, and wasn't it working?

fedorov commented 3 weeks ago

To make sure we don't forget: this sample used to work without the need to explicitly specify transfer syntax back in February 2024!

sedghi commented 3 weeks ago

You can remove the transfer syntax and try it; it's not mandatory.

wlongabaugh commented 3 weeks ago

@fedorov Again, can we please find out what size are the slices coming back from the store when hit directly: 32M+ or ~20MB?

pedrokohler commented 3 weeks ago

@pedrokohler it is more important to understand the source of the regression than to find a workaround to just make it work. Can you please try all the items we discussed yesterday, and report on the findings?

Try to not include transfer-syntax=*, try not to have quotes, try to request single part

Look at the changes in the code around header

Go back to the version of ViewersV3 from February and test against the proxy and against DICOM Google server directly

@fedorov I went back the versions of idc's ViewersV3 up to February and it did not work. The same issue happened.

pedrokohler commented 3 weeks ago

@fedorov Again, can we please find out what size are the slices coming back from the store when hit directly: 32M+ or ~20MB?

It's over 32MB

wlongabaugh commented 3 weeks ago

Thanks for that info. I think we need to find out from Google if they changed things.

If Google is not going to make it work again, we have two options. I can either run the proxy on App Engine Flex to handle > 32 MB, or the proxy could do special handling of these studies by looking for the series ID and changing the transfer encoding to "transfer-syntax=1.2.840.10008.1.2.1". I have been hesitant to go to Flex, since we will need to extensively test the performance of spooling up new instances under shock loading from multiple users.

sedghi commented 3 weeks ago

I'm fixing the double transfer syntax here just FYI https://github.com/OHIF/Viewers/pull/4478

pedrokohler commented 3 weeks ago

@sedghi @fedorov @wlongabaugh FYI removing the transfer-syntax completely also works

wlongabaugh commented 3 weeks ago

removing the transfer-syntax completely also works

Really? We had to add that to get it to compress back in 2023. That's weird. Sounds like a simple fix then?

wlongabaugh commented 3 weeks ago

@pedrokohler If your test that dropping the transfer-syntax worked was against v1 of the API, you need to check against v1beta1 as well.

fedorov commented 3 weeks ago

FYI removing the transfer-syntax completely also works

@pedrokohler as before, when you say "works", can you confirm if the content returned is compressed or not?

removing the transfer-syntax completely also works

Really? We had to add that to get it to compress back in 2023. That's weird. Sounds like a simple fix then?

My read of the issue that you brought back is that we indeed had to remove it to compress back in February (not in 2023, based on the comment date). See below the screenshot of the issue that is in our private repo that you mentioned on slack.

OHIF / Viewers