ImagingDataCommons / dicom-microscopy-viewer

Web-based DICOM slide microscopy viewer library
https://imagingdatacommons.github.io/dicom-microscopy-viewer/
MIT License
100 stars 35 forks source link

IMS responds with octet-stream when image/jpeg is available #90

Closed cgorman closed 1 year ago

cgorman commented 1 year ago

I'm currently working with a Slim deployment which is retrieving JPEG encoded SM instances. The request's accept header is as follows:

multipart/related; type="image/jls"; transfer-syntax=1.2.840.10008.1.2.4.80, multipart/related; type="image/jls"; transfer-syntax=1.2.840.10008.1.2.4.81, multipart/related; type="image/jp2"; transfer-syntax=1.2.840.10008.1.2.4.90, multipart/related; type="image/jp2"; transfer-syntax=1.2.840.10008.1.2.4.91, multipart/related; type="image/jpx"; transfer-syntax=1.2.840.10008.1.2.4.92, multipart/related; type="image/jpx"; transfer-syntax=1.2.840.10008.1.2.4.93, multipart/related; type="application/octet-stream"; transfer-syntax=*, multipart/related; type="image/jpeg"; transfer-syntax=1.2.840.10008.1.2.4.50

and the response's content type is: multipart/related;start="<***@resteasy-multipart>";type="application/octet-stream"; boundary***

The source image is JPEG encoded, so I think that since JPEG appears after octet-stream, it chooses octet-stream instead. I assume the issue is introduced here: https://github.com/ImagingDataCommons/dicom-microscopy-viewer/blob/097282b5690886735f3fe6521d0774ee33b7f418/src/pyramid.js#L461-L472

Would it be sufficient to just put push the octet stream mediatype after the jpeg mediatype if statement?

hackermd commented 1 year ago

What archive backend are you using @cgorman? From an HTTP perspective, the order shouldn't be significant. I would consider this a bug in the archive and/or a lack of clarity of the DICOM standard regarding the expected behaviour.

Intuitively, I would expect the archive to return the image as it is stored if the corresponding media type is in the list of acceptable media types provided by the client. Unfortunately, that behaviour is not explicitly defined by the standard.

Some archives return the pixel data as stored when they are requested with type="application/octet-stream"; transfer-syntax=*. That behaviour appears to be standard conformant. The standard states that a request with such an accept header should result in the data being returned in the default media type for the resource category (see PS3.18 8.7.7 Accept Header Field and PS3.18 8.7.5 Acceptable Media Types). For Frame Pixel Data resources (a subset of Bulkdata resources) the default media type for JPEG compressed pixel data should be image/jpeg (see PS3.28 8.7.3.3.2 Compressed Bulkdata Media Types). cc @dclunie

hackermd commented 1 year ago

Unfortunately, archives have not done a great job in implementing this part of the standard and the behaviour varies between implementations.

Some archives include the "Available Transfer Syntax UID" attribute in the search results (see PS3.18 10.6.3.3.3 Instance Resources). If provided, the value may help us in selecting a single acceptable media type without having to provide a long list of acceptable media types, which many archives apparently struggle to handle correctly.

cc @gunterze @jasonklotzer @stevenborg for awareness

cgorman commented 1 year ago

Ah gotcha, yeah I've only been made aware of this issue when working with dcm4chee so it's possible it's an archive thing. We are trying a few different archives so I'll let you know if we experience the same thing.

gunterze commented 1 year ago

dcm4chee-arc 5.29.3 will prioritize Media Types in Accept Header or Accept Query Parameter for compressed formats over Media Types for uncompressed formats with equal q value https://github.com/dcm4che/dcm4chee-arc-light/issues/3981

Recommend to specify a q value < 1.0 for multipart/related; type="application/octet-stream", to do not rely on how an archive prioritize Media Types with equal q value.

dclunie commented 1 year ago

I am not sure how may servers implement and return it, but in CP 1901 we did add as an instance-level query parameter the Available Transfer Syntax UID (0008,3002) to describe the Transfer Syntaxes that the origin server will assure will be supported for retrieval of the SOP Instance.

The idea is that you can find out (per instance) what the appropriate Transfer Syntax is, if you want it the way it is stored (or can losslessly be converted to), and then you can craft your Accept header very narrowly accordingly.

See PS3.18 10.6.3.3.3 Instance Resources and PS3.4 C.6.1.1.5.2 Available SOP Transfer Syntax UID.

I haven't checked to see whether the Google DicomStore used by IDC actually implements this or not (but it isn't mention in their documentation), and if not, we should probably put in a feature request, particularly (but not only) if any of our viewers (Slim or OHIF) would find this useful. (@fedorov)

@gunterze, does dcm4chee-arc 5? I haven't tried it, but I think it does, based on this closed as implemented tracker item.

gunterze commented 1 year ago

Yes it does. E.g.:

$ curl http://localhost:8080/dcm4chee-arc/aets/DCM4CHEE/rs/studies/2.16.840.1.113995.3.110.3.0.10118.6000009.497166.721604/instances | jq '.[]["00083002"]'
{
  "vr": "UI",
  "Value": [
    "1.2.840.10008.1.2.4.50"
  ]
}
cgorman commented 1 year ago

Created issue and PR in dcmjs dicomweb-client.