Open fedorov opened 1 month ago
For instance, if I want to debug this specific study without downloading it, how can I do that in my local OHIF since you said the DICOMweb server is public?
CC @igoroctaviano
@sedghi I am not an OHIF developer, so I cannot answer how to do it in OHIF configuration (@igoroctaviano can), but IDC public DICOMweb endpoint is this: https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb
(documented in https://learn.canceridc.dev/portal/proxy-policy). Note that the "no downloads" wording in the URL is legacy, and we do not restrict that DICOMweb endpoint for downloads - as discussed in the link above.
DICOM PlanarConfiguration
will indicate whether color image is ordered as RGBRGB or RRGGBB. Need to investigate.
@dclunie I checked, and for Visible Human, we have both options for PlanarConfiguration
. Neither one works.
SELECT distinct(PatientID), StudyInstanceUID, SeriesInstanceUID, PlanarConfiguration
FROM `bigquery-public-data.idc_current.dicom_all`
where Modality = "XC"
PatientID StudyInstanceUID SeriesInstanceUID PlanarConfiguration
VHP-M 1.3.6.1.4.1.5962.1.2.0.1672334394.26545 1.3.6.1.4.1.5962.1.3.0.2.1672334394.26545 0
VHP-F 1.3.6.1.4.1.5962.1.2.0.1677425356.1733 1.3.6.1.4.1.5962.1.3.0.1.1677425356.1733 1
VHP-M 1.3.6.1.4.1.5962.1.2.0.1672334394.26545 1.3.6.1.4.1.5962.1.3.0.1.1672334394.26545 1
VHP-F 1.3.6.1.4.1.5962.1.2.0.1677425356.1733 1.3.6.1.4.1.5962.1.3.0.2.1677425356.1733 0
At the same time, I checked the following 2 US series, with different options
1 | LiverUS-04 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809 | 0 |
2 | C3N-02436 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567 |
and both work fine:
So it must be something else, or combination with something else...
Must be something else then. I get the impression that color image handling in OHIF is extremely fragile (whether compressed or uncompressed).
@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?
@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?
@pedrokohler, this is typically related to a decoding issue. I'd suggest rolling back to commits before any recent RGB/decoding changes, testing them one by one.
@sedghi, it might be a good idea to open an issue to add snapshot testing in CS3D to help prevent these kinds of regressions.
Here’s a similar issue we encountered previously: https://github.com/cornerstonejs/cornerstone3D/pull/829
There's a widespread problem with DICOM color images. Here's the core issue:
Take a look at these discussions for instance
I need help fixing the color settings permanently.
@sedghi if you can provide a reproducible example demonstrating DICOM non-compliance of Google Healthcare API DICOMweb implementation, we can escalate through our collaborators at Google to try to get it fixed. I do not know enough about this issue to reproduce it myself.
Can @dclunie clarify the DICOM standard's requirements for Photometric Interpretation? Should we rely on that? since we are seeing inconsistencies with that tag and what actually the server responds in pixelData
@sedghi @fedorov after lots of testing, I found that this PR was the responsible for making it fail: https://github.com/cornerstonejs/cornerstone3D/pull/829/files
I'll investigate further to see what exactly is happening there
@pedrokohler thank you, but let's wait to hear from @dclunie about the question asked by Alireza first!
Very long ... sorry ...
Here are few points to consider wrt. what the standard says about this and how it relates to @sedghi's description of the recent changes, and what might be needed to make this more robust going forwards:
From an application implementation perspective this can be even more challenging depending on what the underlying toolkits and libraries and codecs do and how much control there is over what they do (e.g., a lossy baseline JPEG decoder might always attempt YCbCr to RGB conversion even if they really were RGB, or always do it if a JFIF header is present, or check for the Adobe marker segment that signals RGB, or whatever).
To restate the obvious, the only way to mitigate against unexpected code changes causing failures is to test - have a good range of test inputs representing as many known good (and bad) variants as possible, and a decoded result to expect, and automate the comparison in regression tests. For properly encoded DICOM images (uncompressed, compressed and decompressed/re-encoded) there should be no question about the correct result. For "bad" images (e.g., RGB Photometric Interpretation for a YCbCr encoded lossy JPEG bitstream signaled by a JFIF header, or a decompressed JPEG that still has YBR_FULL_422 for the Photometric Interpretation) some heuristics may be implemented to work around this and need to be tested too but the heuristic should never prevent the correct handling of a correct image.
In the case of the Cornerstone PR #829 that was mentioned as responsible for the problem, there is a new function isColorConversionRequired() that is supposed to work around an Orthanc server bug "[Orthanc convert YBR to RGB but does not change metadata](https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533)", and which @sedghi also referred to. Unfortunately it seems that insufficient testing was done to assure that this workaround did not undermine handling of "correct" images in which the codec had handled the color component transformation (and chrominance channel upsampling; repeating it would screw things up, if that is what is actually happening.
The heuristic in isColorConversionRequired() seems to be trying to detect uncompressed chrominance downsampling based on the downsampling description in Photometric Interpretation matching the decompressed pixel data length, which is a whole other thing (and as @sedghi described, is often related to funky ultrasound images; some uncompressed or decompressed RLE ultrasound images really do have uncompressed color component downsampling). I don't actually see how the function (and its consequences if it returns true) actually specifically target the Orthanc defect anyway, other than by assuming that if it says YBR_FULL_422 but the decompressed pixel data length is not downsampled (i.e., it just equals columns rows samplesperpixel * bytespersample; I think the imageFrame must already exclude the padding byte to even judging by other code therein).
However, since our problem images should be Photometric Interpretation RGB and uncompressed, I am not sure how this PR is causing our images to fail, since isColorConversionRequired() should return false by my reading of it.
I assume there is no issue with the DICOMweb side of things, in that the image requested was returned uncompressed. If it had been requested with an Accept header of image/jpeg rather than application/octet-stream, and the server has lossy JPEG compressed them to satisfy the frame request (perhaps even retrieve rendered), and of course while doing that would very likely color space transform them, with no explicit means of signaling that (except perhaps a JFIF header included in the compressed bitstream). See also the warning about this sort of thing in the description in PS3.18. I.e., the metadata response and the bulk pixel data response would be inconsistent in such cases wrt. the characteristics of the pixel data. This is purely speculative on my part since I have now idea how OHIF is retrieving these images, or under what circumstances the routines in the PR get invoked.
I also see a bunch of stuff in the PR about Palette Color Lookup Tables, but that should not be relevant to us since they won't be in the problem image.
@pedrokohler you should also use samples mentioned in https://github.com/OHIF/Viewers/issues/2934! Also samples from DICOM FTP server: ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG12/.
@fedorov @dclunie the problem is the isColorConversionRequired
function of the PR I sent above.
Once I force it to return true for the dataset above, it renders the image correctly.
@dclunie what else could be checked inside this function so that we would be able to tell that this image requires conversion? For instance, I could check to see if the modality is XC and always return true in this case - note that I don't know if this makes any sense, I'm just giving an example.
Once I force it to return true for the dataset above, it renders the image correctly.
Very interesting observation.
It makes little sense though, since this really is an RGB image, so by definition should not require "color conversion".
I can only assume that whatever action is being (needs to be) triggered by isColorConversionRequired()
returning true is doing more than "color conversion" - perhaps related to changing PlanarConfiguration of 1 (color-by-plane) to 0 (color-by-pixel)?
I could check to see if the modality is XC and always return true in this case
Please do not do that - you need to understand the root cause of the problem with what is perfectly good input data and fix it. There is nothing special about the modality or any other attribute that is not directly related to the description of the encoding of the pixel data (i.e., Image Pixel Module attributes and Transfer Syntax).
I think you need to figure out what "color conversion" entails and whether or not it can safely be triggered by PlanarConfiguration == 1, or if it needs to be refactored (to separate "color component conversion" (e.g, from YBR to RGB, +/- chrominance channel upsampling) from "plane organization" (color-by-plane to color-by-pixel).
Who wrote the "color conversion" code? You could talk to them about their design for this.
@dclunie Thank you David for the excellent DICOM color crash course. It will help us improve our color handling stability.
@fedorov It looks like this was really a cornerstone3D issue but it was already fixed in this PR https://github.com/cornerstonejs/cornerstone3D/pull/1457/files
I'm waiting for Bill Longabaugh to confirm that ViewersV3 does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.
If this is indeed the case, then we just need to deploy the master branch and the issue should be fixed.
I'm waiting for Bill Longabaugh to confirm that idc-slim does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.
You mean ViewersV3, not idc-slim, right?
You mean ViewersV3, not idc-slim, right?
My bad. Yes, I mean ViewersV3.
Replicating from Slack and to keep @dclunie in the loop. @pedrokohler can you please take a look?
Per @wlongabaugh:
OK, in the past, it appears the > 32 MB slices got sent as roughly ~20 MB slices, and this appeared to be due to a change in the Accept: transfer-syntax=* request header: https://github.com/ImagingDataCommons/IDC-ProjectManagement/issues/1574#issuecomment-1924983645
1:13 OK, I admit I was wrong, and yes it used to work, by getting the > 32 MB slices to work by getting them to be ~20 MB on the wire.
I note the current V3 viewer is sending this:
accept:
multipart/related; type=application/octet-stream; transfer-syntax=*; transfer-syntax=*
1:12
Is the second transfer-syntax=*
(and w/o a ";") messing things up?
Checkout extensions/cornerstone/src/initWADOImageLoader.js that is the location that decide on the transfer syntax
and
platform/core/src/utils/generateAcceptHeader.ts
@fedorov I removed the duplicate and just tested it with transfer-syntax=*
and with transfer-syntax=*;
, meaning both with and without semicolon. Still same error occurs in both cases.
@fedorov either way this error has nothing to do with this ticket. According to @wlongabaugh on Slack this same error is occurring in production right now.
Thank you for checking. Let's wait to see what @wlongabaugh says about this.
Sometimes the type should be in quotes. We used to have an option to omit it, as including it can trigger an options request flow on some servers, slowing things down. However, sometimes it requires the type to be in quotes, like multipart/related; type="application/octet-stream"; transfer-syntax=*
.
@wlongabaugh @pedrokohler note that while accessing IDC DICOM store directly - bypassing the proxy - I am able to visualize both series.
See demo here: https://app.screencast.com/QEePMcwuQOuWb.
Also note the type is indeed in quotes:
No server error
For this experiment, I utilized the v3 instance deployed by @igoroctaviano via https://github.com/ImagingDataCommons/idc-viewers-sandbox-gha-testing. I do not know where that version is relative to various IDC tiers.
Could it be that the proxy started dropping the quotes @wlongabaugh?
What size are the slices that are being sent directly from Google? Are they 32+ MB, or ~20 MB? If Google has taken to ignoring the fact that we are allowing a gzip transfer encoding to be sent, that would be the problem. @fedorov
To clarify, I have no doubt that a direct call to Google would work. The question is whether the direct call returns a compressed slice or not. If it is uncompressed, that yes the Viewer will read it, but it would not survive a pass through the proxy since it is no longer compressed.
@fedorov The code that forwards the headers from the client to the Google backend has not changed.
The proxy code that passes the headers through to the backend is
headers={key: value for (key, value) in request.headers if key != 'Host'},
so however the Python requests package treats header quotes is what is happening.
It is true that the Python version has been upgraded, as the old version was no longer supported. Seems unlikely that the requests behavior would change for handling headers, but that is a difference.
But I see that the 32M+ slices series no longer loads in production either, where the headers are what they have always been:
accept: multipart/related; type=application/octet-stream; transfer-syntax=*
accept-encoding: gzip, deflate, br, zstd
the Python version is still 3.8, and the proxy code has not changed in many months. That is still running against v1 (not v1beta1) which implies Google changed some behavior since we deployed in the summer of 2023.
the headers are what they have always been
I have no means to ascertain what they have always been. I do observe that when I use OHIF sandbox instance against IDC DICOM store directly, the parameters are in quotes. I do not know if this has any effect on the issue we are discussing.
@fedorov The important question remains. What size are the slices coming back from the store when hit directly: 32M+ or ~20MB?
@fedorov @dclunie @sedghi I was able to make it work with the current version just by forcing a transfer-syntax=1.2.840.10008.1.2.1 instead of transfer-syntax=*
@dclunie @fedorov do we want to use this transfer-syntax in other cases as well or just this one?
@sedghi what's the best way to make OHIF use this transfer-syntax in this specific case or any other cases that Clunie and Fedorov specify?
@pedrokohler it is more important to understand the source of the regression than to find a workaround to just make it work. Can you please try all the items we discussed yesterday, and report on the findings?
Hmm, we don't handle transfer syntax on a case-by-case basis. Honestly, I don't think there's a good way to do this. Didn't Pedro try the latest Cornerstone3D, and wasn't it working?
To make sure we don't forget: this sample used to work without the need to explicitly specify transfer syntax back in February 2024!
You can remove the transfer syntax and try it; it's not mandatory.
@fedorov Again, can we please find out what size are the slices coming back from the store when hit directly: 32M+ or ~20MB?
@pedrokohler it is more important to understand the source of the regression than to find a workaround to just make it work. Can you please try all the items we discussed yesterday, and report on the findings?
- Try to not include transfer-syntax=*, try not to have quotes, try to request single part
- Look at the changes in the code around header
- Go back to the version of ViewersV3 from February and test against the proxy and against DICOM Google server directly
@fedorov I went back the versions of idc's ViewersV3 up to February and it did not work. The same issue happened.
@fedorov Again, can we please find out what size are the slices coming back from the store when hit directly: 32M+ or ~20MB?
It's over 32MB
Thanks for that info. I think we need to find out from Google if they changed things.
If Google is not going to make it work again, we have two options. I can either run the proxy on App Engine Flex to handle > 32 MB, or the proxy could do special handling of these studies by looking for the series ID and changing the transfer encoding to "transfer-syntax=1.2.840.10008.1.2.1". I have been hesitant to go to Flex, since we will need to extensively test the performance of spooling up new instances under shock loading from multiple users.
I'm fixing the double transfer syntax here just FYI https://github.com/OHIF/Viewers/pull/4478
@sedghi @fedorov @wlongabaugh FYI removing the transfer-syntax completely also works
removing the transfer-syntax completely also works
Really? We had to add that to get it to compress back in 2023. That's weird. Sounds like a simple fix then?
@pedrokohler If your test that dropping the transfer-syntax worked was against v1 of the API, you need to check against v1beta1 as well.
FYI removing the transfer-syntax completely also works
@pedrokohler as before, when you say "works", can you confirm if the content returned is compressed or not?
removing the transfer-syntax completely also works
Really? We had to add that to get it to compress back in 2023. That's weird. Sounds like a simple fix then?
My read of the issue that you brought back is that we indeed had to remove it to compress back in February (not in 2023, based on the comment date). See below the screenshot of the issue that is in our private repo that you mentioned on slack.
Describe the Bug
This sample is no longer rendered correctly: https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.5962.1.2.0.1677425356.1733
For comparison, see how it used to work here: https://github.com/OHIF/Viewers/issues/3354#issuecomment-1673389784:
To download the files in that study (note - it is ~233GB!), assuming you have python and pip on your system, do the following:
Steps to Reproduce
Load the sample study referenced in the above into the viewer.
The current behavior
First series shows grayscale with multiple slices arranged in a mosaic. Second does not load at all.
The expected behavior
Both should show up in color as used to work in the past given the screenshot in https://github.com/OHIF/Viewers/issues/3354#issuecomment-1673389784
OS
macOS
Node version
n/a
Browser
Chrome