US cine loops with Transfer Syntax RLE Lossless are corrupted after DicomMessage.write when fragmentMultiframe option is enabled

jimOnAir commented 1 year ago

Hello. I faced corrupted US cine loops with Transfer Syntax RLE Lossless when using dcmjs-dimse for receiving and sending DICOM images with C-STORE. After some research, I've found that it is related to the fragmentMultiframe option of DicomMessage.write() which enabled by default. When fragmentMultiframe is true I have the following errors with further processing:

$ find . -name '*.dcm' -exec /opt/dcm4che/bin/dcm2jpg {} {}.jpg \;
./1.2.156.112536.1.2118.80201085056.14485206710.12.dcm-patched.dcm -> ./1.2.156.112536.1.2118.80201085056.14485206710.12.dcm-patched.dcm.jpg
Native OpenCV library has been found in /opt/dcm4che/lib/macosx-aarch64/libopencv_java.dylib
08:49:55.929 DEBUG - Decompressor: org.dcm4che3.imageio.plugins.rle.RLEImageReader
08:49:55.930 DEBUG - Start decompressing frame #1
08:49:55.933 INFO  - RLE Segment #1 too short, set missing 363917 bytes to 0
08:49:55.933 INFO  - RLE Segment #2 too short, set missing 480000 bytes to 0
08:49:55.933 INFO  - RLE Segment #3 too short, set missing 480000 bytes to 0
08:49:55.933 DEBUG - Finished decompressing frame #1

$ dcmj2pnm --nointerlace --write-png --accept-acr-nema --frame 1 1.2.156.112536.1.2118.80201085056.14485206710.12.dcm-patched.dcm 1.2.156.112536.1.2118.80201085056.14485206710.12.dcm-patched.png --debug
D: $dcmtk: dcmj2pnm v3.6.7 2022-04-22 $
D:
D: DcmDataDictionary: Loading file: /usr/share/dcmtk/dicom.dic
I: reading DICOM file: 1.2.156.112536.1.2118.80201085056.14485206710.12.dcm-patched.dcm
D: DcmMetaInfo::checkAndReadPreamble() TransferSyntax="Little Endian Explicit"
D: DcmDataset::read() TransferSyntax="RLE Lossless"
I: preparing pixel data
D: transfer syntax of DICOM dataset: RLE Lossless (1.2.840.10008.1.2.5)
D: using partial read access to compressed pixel data
D: RLE decoder processes frame 0
D: RLE decoder processes pixel item 1
Segmentation fault

What is the purpose of splitting PixelData into such small chunks by default?

jimOnAir commented 1 year ago

The corresponding issue in dcmjs-dimse: #PantelisGeorgiadis/dcmjs-dimse/issues/40

pieper commented 1 year ago

What is the purpose of splitting PixelData into such small chunks by default?

Thanks for the careful debugging. it looks like there's a long history of commits related to this code, but the behavior seems to trace back to the original implementation, so there's no specific discussion of the original implementation, but it's probably for efficiency and memory management.

https://github.com/dcmjs-org/dcmjs/blame/646b1f181a3add90588175c79376308e9d1eb84b/src/ValueRepresentation.js

https://github.com/dcmjs-org/dcmjs/commit/b77c6423253421885b7b452b57b50845a2d35da3

https://github.com/dcmjs-org/dcmjs/commit/d92ffc4e6cf46b5143168660bba90e5c7cc8a89f

Since you have been able to isolate this issue do you see a solution?

richard-viney commented 1 year ago

I've also run into an issue in this area recently, and discovered that pydicom with certain libraries loaded doesn't reliably handle multiple fragments per frame in multi-frame DICOMs. Opened an issue for it here: https://github.com/pydicom/pydicom/issues/1774.

My understanding (which is not canonical) is that these fragments are supported in order to help split up very large pieces of compressed image/frame data, but exactly how this is meaningfully leveraged by implementations in practice I don't have direct experience of.

The relevant part of the spec is https://dicom.nema.org/dicom/2013/output/chtml/part05/sect_A.4.html.

In my experience, the general level of support for multi-fragment storage of DICOM image data, particularly multiple fragments per frame in e.g. a multi-frame ultrasound, seems patchy. For this reason, not using it seems reasonable unless one has a strong reason to want it (and frankly I've yet to come across one, but am keen to understand what the use cases are).

Why dcmjs turns fragments on by default, and uses a fragment size of 20 KiB which does seem small, I simply don't know. I did some PRs a while back that fixed dcmjs' support for multi-fragment files so it was in alignment with the DICOM spec, but no doubt there are libraries/implementations around that don't follow the spec or implement all parts of it, so it may be best to just turn off multi-fragment altogether via writeOptions.

I think I'd be in favour of making this the default in dcmjs, but in saying that I don't feel 100% confident that I understand the potential ramifications of doing that. Maybe in some cases with very large datasets it starts to matter, but also those people can always turn it on if they need it.

pieper commented 1 year ago

I don't have a lot of experience with this either, but I'm sure you are correct @richard-viney that various data or implementations will vary in their support and correctness.

I suspect this feature could come up in whole slide imaging use cases, e.g. in Slim.

At this point I'm wary of changing default behavior but if we have some example data and tests we could compare notes with other dicom developers and see what we think the most reasonable behavior should be.

richard-viney commented 1 year ago

For what it's worth, it looks like pydicom defaults to one fragment per frame:

https://github.com/pydicom/pydicom/blob/2709ff6ed5458bef0da778c56b9918c8c62c4292/pydicom/encaps.py#L658

dcmjs-org / dcmjs

US cine loops with Transfer Syntax RLE Lossless are corrupted after DicomMessage.write when fragmentMultiframe option is enabled #340