Enet4 / dicom-rs

Rust implementation of the DICOM standard
https://dicom-rs.github.io
Apache License 2.0

Error while accessing PixelData bytes #336

Closed scandav closed 1 year ago

scandav commented 1 year ago

Hi, I am following the example reported in the crate docs.

use dicom::core::Tag;
use dicom::object::{open_file, Result};

let obj = open_file("0001.dcm")?;
let pixel_data_bytes = obj.element(Tag(0x7FE0, 0x0010))?.to_bytes()?;

I am getting the following error: bad value cast: requested bytes but value is PixelSequence

Dicom file from here: IMG00039_38464LAD_LAO.dcm.zip

Enet4 commented 1 year ago

Hello! Since that instance is encoded in JPEG Lossless, the pixel data is stored as a sequence of pixel data fragments. The error is by design, because to_bytes only works for non-sequence values.

If you intend to retrieve the raw data within the fragments comprising the frames, you would instead retrieve the items of the pixel data sequence (.items()). Note that dicom-pixeldata provides a higher level API for decoding the pixel data, regardless of transfer syntax.
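
To illustrate what "a sequence of pixel data fragments" means on disk, here is a minimal sketch of splitting the body of an encapsulated pixel data element into its items. It assumes the layout defined by the DICOM standard for encapsulated transfer syntaxes: each item is the tag (FFFE,E000) followed by a 4-byte little-endian length, the first item is the basic offset table, and the sequence ends with the delimiter (FFFE,E0DD). The function name `split_items` is hypothetical and this is not how dicom-rs parses the data internally.

```rust
/// Splits the body of an encapsulated pixel data element into its items.
/// The first returned item is the basic offset table (possibly empty);
/// the remaining ones are the compressed fragments.
/// `split_items` is a hypothetical helper, not part of dicom-rs.
fn split_items(mut data: &[u8]) -> Result<Vec<&[u8]>, String> {
    // Tags as they appear in little-endian byte order.
    const ITEM: [u8; 4] = [0xFE, 0xFF, 0x00, 0xE0]; // (FFFE,E000)
    const SEQ_DELIM: [u8; 4] = [0xFE, 0xFF, 0xDD, 0xE0]; // (FFFE,E0DD)

    let mut items = Vec::new();
    loop {
        if data.len() < 8 {
            return Err("truncated item header".to_string());
        }
        if data.starts_with(&SEQ_DELIM) {
            return Ok(items);
        }
        if !data.starts_with(&ITEM) {
            return Err("unexpected tag in pixel data sequence".to_string());
        }
        // 4-byte little-endian item length follows the tag.
        let len = u32::from_le_bytes([data[4], data[5], data[6], data[7]]) as usize;
        let rest = &data[8..];
        if rest.len() < len {
            return Err("truncated item value".to_string());
        }
        let (value, tail) = rest.split_at(len);
        items.push(value);
        data = tail;
    }
}

fn main() {
    // Empty offset table, one 4-byte fragment, then the sequence delimiter.
    let bytes: Vec<u8> = vec![
        0xFE, 0xFF, 0x00, 0xE0, 0, 0, 0, 0, // offset table item, length 0
        0xFE, 0xFF, 0x00, 0xE0, 4, 0, 0, 0, 0xFF, 0xD8, 0xFF, 0xD9, // fragment
        0xFE, 0xFF, 0xDD, 0xE0, 0, 0, 0, 0, // sequence delimiter
    ];
    let items = split_items(&bytes).expect("well-formed sequence");
    println!("{} items, fragment length {}", items.len(), items[1].len());
}
```

Because the fragments are still compressed, their bytes are not the pixel values themselves, which is why to_bytes refuses to hand them out as one flat value.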

scandav commented 1 year ago

Thank you for your answer, understood.

As I have to compute a checksum on the image content of a set of DICOM files, I do not need to decode the images with dicom-pixeldata, although that is the solution I have implemented so far.

let pixel_digest = md5::compute(obj.decode_pixel_data()?.data());

I was wondering whether there was an option to retrieve the raw bytes of the bulk data as read from the file. That would make it much faster.

Update: I have tried with .items() as you suggested above but it returns None

Enet4 commented 1 year ago

> Update: I have tried with .items() as you suggested above but it returns None

Sorry for the confusion, it would be .value().fragments() instead. items() provides data set items only. This should hopefully be enough to compute a digest of all raw fragment bytes. They do not even have to be collected into a flat array if you use a digest Context.
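
As a rough sketch of the "digest Context" idea: an incremental hasher can be fed one fragment at a time and yields the same result as hashing the concatenated bytes. The example below uses a small hand-written FNV-1a hasher purely as a stand-in, so that it runs without external crates; `Fnv1a` and `digest_fragments` are hypothetical names, and the fragments are modelled as plain `Vec<u8>` values rather than taken from a DICOM object. With a real digest crate, its own context type would take the place of `Fnv1a`.

```rust
/// Minimal incremental 64-bit FNV-1a hasher, used here only as a stand-in
/// for the context type of a real digest crate.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }

    /// Feeds more bytes into the running hash.
    fn consume(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hashes all fragments in order without copying them into one buffer.
fn digest_fragments(fragments: &[Vec<u8>]) -> u64 {
    let mut ctx = Fnv1a::new();
    for fragment in fragments {
        ctx.consume(fragment);
    }
    ctx.finish()
}

fn main() {
    // Made-up fragment contents, for illustration only.
    let fragments: Vec<Vec<u8>> = vec![vec![0xFF, 0xD8, 0x01], vec![0x02, 0x03], vec![0xFF, 0xD9]];

    let streamed = digest_fragments(&fragments);

    // Same result as hashing one flat array, but without the extra allocation.
    let flat: Vec<u8> = fragments.concat();
    let mut ctx = Fnv1a::new();
    ctx.consume(&flat);
    assert_eq!(streamed, ctx.finish());

    println!("{:016x}", streamed);
}
```

Note that a digest computed this way covers the compressed fragment bytes, so it only matches across files that share the same encoding.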


This is still under consideration, but there might be room for an extension to the dicom-object API that supports multi-step loading of DICOM data sets. It would also let the consumer know the position where certain elements start, and fall back to plain reader cursors for larger values.

scandav commented 1 year ago

> Sorry for the confusion, it would be .value().fragments() instead.

I have tried what you suggested but it seems like the Value::PixelSequence has no fragments method. It seems to have only a private fragments attribute.

https://github.com/Enet4/dicom-rs/blob/c1644cf355f160946f947273bb89283546be2c98/core/src/value/mod.rs#L66-L71

Enet4 commented 1 year ago

Hmm right, the method is still not available upstream, but since the field is inside a public enum variant, it is not private. It just needs some pattern matching to retrieve it right now:

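use dicom::core::DicomValue;
use dicom::dictionary_std::tags;

match obj.element(tags::PIXEL_DATA)?.value() {
    // native (non-encapsulated) pixel data: a single primitive value
    DicomValue::Primitive(value) => {
        todo!("value.to_bytes() works here")
    },
    // encapsulated pixel data: the raw bytes live in the fragments
    DicomValue::PixelSequence { fragments, .. } => {
        todo!("fragments available")
    },
    DicomValue::Sequence { .. } => panic!("illegal value type for pixel data"),
}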
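
For readers who want to see the pattern matching end to end without a DICOM file at hand, here is a self-contained sketch. The `Value` enum below is a hypothetical stand-in that only mirrors the variant shapes used in the snippet above; it is not the dicom-rs type, and `raw_chunks` is a made-up helper name.

```rust
/// Hypothetical stand-in for the library's value enum, reduced to the
/// three variant shapes that appear in the thread.
#[allow(dead_code)]
enum Value {
    Primitive(Vec<u8>),
    Sequence { items: Vec<String> },
    PixelSequence { offset_table: Vec<u32>, fragments: Vec<Vec<u8>> },
}

/// Returns the raw pixel data bytes as a list of chunks, without decoding:
/// one chunk for a primitive value, one per fragment for a pixel sequence.
/// Returns `None` for a data set sequence, which is not valid pixel data.
fn raw_chunks(value: &Value) -> Option<Vec<&[u8]>> {
    match value {
        Value::Primitive(bytes) => Some(vec![bytes.as_slice()]),
        Value::PixelSequence { fragments, .. } => {
            Some(fragments.iter().map(Vec::as_slice).collect())
        }
        Value::Sequence { .. } => None,
    }
}

fn main() {
    // Made-up fragment contents, for illustration only.
    let value = Value::PixelSequence {
        offset_table: vec![],
        fragments: vec![vec![0xFF, 0xD8], vec![0xFF, 0xD9]],
    };

    let chunks = raw_chunks(&value).expect("pixel data should not be a data set sequence");
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    println!("{} chunks, {} bytes in total", chunks.len(), total);
}
```

Returning `None` instead of panicking is a design choice of this sketch; it leaves the decision about malformed files to the caller.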

scandav commented 1 year ago

It works - and digest on fragments is indeed 30x faster than decoding the pixel data! Thanks a lot for your work! 😃 💯

Enet4 commented 1 year ago

Glad I could be of assistance! I will be closing this issue, but this exchange has certainly brought up some ideas that could lead to more contributions in the future.