pydicom / pydicom

Read, modify and write DICOM files with python code
https://pydicom.github.io/pydicom/dev

Pixel Data is not encapsulated correctly #1320

Closed ravitej177 closed 3 years ago

ravitej177 commented 3 years ago

I want to view the pixel data of a DICOM file (JPEG 2000) using pydicom, but with the GDCM handler I get a warning that "Pixel Data is not encapsulated correctly".

I have also tried the Pillow handler; in that case it raises: Unexpected tag '(0000, 0c00)' when parsing the Basic Table Offset item.

My code:

import pydicom
import numpy as np
import gdcm
import PIL
from pydicom.pixel_data_handlers import pillow_handler, gdcm_handler
ds = pydicom.dcmread('/home/avicii/test.dcm')
print(ds.pixel_array)

Output:

Warning: In /build/gdcm-NQLsIX/gdcm-2.8.4/Source/MediaStorageAndFileFormat/gdcmPixmapReader.cxx, line 830, function bool gdcm::PixmapReader::ReadImageInternal(const gdcm::MediaStorage&, bool)
VOI LUT (0028,3010) are not handled. Image will not be displayed properly

Warning: In /build/gdcm-NQLsIX/gdcm-2.8.4/Source/MediaStorageAndFileFormat/gdcmJPEG2000Codec.cxx, line 499, function virtual bool gdcm::JPEG2000Codec::Decode(const gdcm::DataElement&, gdcm::DataElement&)
Pixel Data is not encapsulated correctly. Continuing anyway

[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]

I'm able to access filemeta:

Dataset.file_meta -------------------------------
(0002, 0000) File Meta Information Group Length  UL: 64
(0002, 0002) Media Storage SOP Class UID         UI: Computed Radiography Image Storage
(0002, 0010) Transfer Syntax UID                 UI: JPEG 2000 Image Compression
-------------------------------------------------
(0008, 0008) Image Type                          CS: ['ORIGINAL', 'PRIMARY']
(0008, 0012) Instance Creation Date              DA: '20201118'
(0008, 0013) Instance Creation Time              TM: '083906'
(0008, 0016) SOP Class UID                       UI: Computed Radiography Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.3.51.0.7.1244853527.15895.27456.36363.49614.20819.35058
(0008, 0020) Study Date                          DA: '20201118'
(0008, 0021) Series Date                         DA: '20201118'
(0008, 0022) Acquisition Date                    DA: '20201118'
(0008, 0023) Content Date                        DA: ''
(0008, 002a) Acquisition DateTime                DT: '20201118084040'
(0008, 0030) Study Time                          TM: '084040'
(0008, 0031) Series Time                         TM: '084040'
(0008, 0032) Acquisition Time                    TM: '084040'
(0008, 0033) Content Time                        TM: ''
(0008, 0050) Accession Number                    SH: ''
(0008, 0060) Modality                            CS: 'CR'
(0008, 0070) Manufacturer                        LO: 'Agfa'
(0008, 0080) Institution Name                    LO: 'VetRadNZ-TaurangaVeterinaryServicesLtd'
(0008, 0081) Institution Address                 ST: 'Katikati'
(0008, 0090) Referring Physician's Name          PN: ''
(0008, 1010) Station Name                        SH: 'TVKKCR01'
(0008, 1030) Study Description                   LO: 'Chest/ Abdomen/ Pelvis'
(0008, 103e) Series Description                  LO: 'Thorax  LAT'
(0008, 1040) Institutional Department Name       LO: ''
(0008, 1090) Manufacturer's Model Name           LO: 'CR10-X'
(0010, 0010) Patient's Name                      PN: 'Fudges Puppies^Wallis'
(0010, 0020) Patient ID                          LO: '287227A'
(0010, 0030) Patient's Birth Date                DA: ''
(0010, 0040) Patient's Sex                       CS: ''
(0010, 2201) Patient Species Description         LO: 'Dog'
(0010, 2292) Patient Breed Description           LO: 'shih tzu'
(0010, 2293)  Patient Breed Code Sequence  0 item(s) ---- 
(0010, 2294)  Breed Registration Sequence  0 item(s) ---- 
(0010, 2297) Responsible Person                  PN: ''
(0010, 2299) Responsible Organization            LO: ''
(0018, 0010) Contrast/Bolus Agent                LO: ''
(0018, 0015) Body Part Examined                  CS: 'CHEST'
(0018, 1000) Device Serial Number                LO: 'PB5151008837'
(0018, 1004) Plate ID                            LO: '100570730151'
(0018, 1008) Gantry ID                           LO: 'TVKKNX01'
(0018, 1020) Software Versions                   LO: 'ARC_2504'
(0018, 1164) Imager Pixel Spacing                DS: [0.1, 0.1]
(0018, 1260) Plate Type                          SH: 'Powder'
(0018, 1402) Cassette Orientation                CS: 'LANDSCAPE'
(0018, 1403) Cassette Size                       CS: '35CMX43CM'
(0018, 1411) Exposure Index                      DS: "19.0"
(0018, 1412) Target Exposure Index               DS: "958.0"
(0018, 1413) Deviation Index                     DS: "-17.0"
(0018, 5101) View Position                       CS: 'RL'
(0019, 0010) Private Creator                     LO: 'Agfa ADC NX'
(0019, 1007) Private tag data                    CS: 'YES'
(0019, 1021) Private tag data                    FL: 21.676000595092773
(0019, 1028) Private tag data                    CS: 'NO'
(0019, 10f6) [Plate Sensitivity]                 DS: "1070.0"
(0019, 10f7) [Plate Erasability]                 DS: "1000.0"
(0019, 10fa) Private tag data                    IS: "19"
(0019, 10fb) Private tag data                    FL: -17.0
(0019, 10fc) Private tag data                    IS: "958"
(0019, 10fd) Private tag data                    CS: 'NO'
(0019, 10fe) [Unknown]                           CS: 'MED'
(0020, 000d) Study Instance UID                  UI: 1.3.51.0.7.13567641530.18686.44101.44859.54446.35358.5633
(0020, 000e) Series Instance UID                 UI: 1.3.51.0.7.3314330875.9178.1869.44054.17266.49585.52114
(0020, 0010) Study ID                            SH: '2011180838270079'
(0020, 0011) Series Number                       IS: "1"
(0020, 0013) Instance Number                     IS: "1"
(0020, 0020) Patient Orientation                 CS: ['P', 'F']
(0020, 0060) Laterality                          CS: 'L'
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 3420
(0028, 0011) Columns                             US: 4218
(0028, 0030) Pixel Spacing                       DS: [0.1, 0.1]
(0028, 0100) Bits Allocated                      US: 16
(0028, 0101) Bits Stored                         US: 16
(0028, 0102) High Bit                            US: 15
(0028, 0103) Pixel Representation                US: 0
(0028, 0106) Smallest Image Pixel Value          US: 0
(0028, 0107) Largest Image Pixel Value           US: 65535
(0028, 0300) Quality Control Image               CS: 'NO'
(0028, 0301) Burned In Annotation                CS: 'NO'
(0028, 1050) Window Center                       DS: "32767.5"
(0028, 1051) Window Width                        DS: "65535.0"
(0028, 1052) Rescale Intercept                   DS: "0.0"
(0028, 1053) Rescale Slope                       DS: "1.0"
(0028, 1054) Rescale Type                        LO: 'P-VALUES'
(0028, 2110) Lossy Image Compression             CS: '00'
(0028, 3010)  VOI LUT Sequence  1 item(s) ---- 
   (0028, 3002) LUT Descriptor                      US: [16384, 8192, 16]
   (0028, 3003) LUT Explanation                     LO: ''
   (0028, 3006) LUT Data                            US: Array of 16384 elements
   ---------
(0040, 0244) Performed Procedure Step Start Date DA: '20201118'
(0040, 0245) Performed Procedure Step Start Time TM: '084040'
(0507, 0010) Private Creator                     LO: 'GENESIS_DIGITAL_IMAGING_V3.1_IMAGE_INFO_GROUP'
(0507, 1001)  Private tag data  1 item(s) ---- 
   (0507, 1002) Private tag data                    LO: ''
   (0507, 1003) Private tag data                    LO: ''
   (0507, 1004) Private tag data                    LO: ''
   (0507, 1005) Private tag data                    LO: '0'
   (0507, 1006) Private tag data                    LO: '0'
   (0507, 1007) Private tag data                    LO: ''
   (0507, 1008) Private tag data                    LO: '0'
   ---------
(2050, 0020) Presentation LUT Shape              CS: 'INVERSE'
(7fe0, 0010) Pixel Data                          OW: Array of 721348 elements

I have tried both Python 2.7 and Python 3.6.9, and it occurs in both versions. How can I re-encapsulate the pixel data of the DICOM file to avoid these errors?

I'm attaching a sample file here for reference:

https://dl.dropboxusercontent.com/s/8xic5xlax58p0u0/test.dcm?dl=0

scaramallion commented 3 years ago

Hmm, there are a number of things wrong with the dataset:

  1. Uses a compressed transfer syntax but the pixel data has a defined length (should be undefined, 0xFFFFFFFF)
  2. No encapsulation of compressed pixel data
  3. The J2K data includes a JP2 header
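
As a rough illustration, points 2 and 3 above can be checked directly on the raw Pixel Data bytes. This is a minimal sketch, not part of pydicom: the helper name `describe_pixel_data` and the returned keys are made up for this example. It relies only on well-known byte signatures: encapsulated data starts with an Item tag (FFFE,E000), a JP2 file starts with a 12-byte signature box, and a JPEG 2000 codestream starts with the SOC marker 0xFF4F. Incidentally, the first four bytes of the JP2 signature box (00 00 00 0C) read as a little-endian tag give (0000,0C00), which is consistent with the Pillow handler error quoted in the original report.

```python
# Byte signatures used for the checks (all standard values)
ITEM_TAG_LE = b"\xfe\xff\x00\xe0"                   # (FFFE,E000) Item tag, little endian
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"   # 12-byte JP2 signature box
J2K_SOC = b"\xff\x4f"                               # JPEG 2000 start-of-codestream marker


def describe_pixel_data(data: bytes) -> dict:
    """Hypothetical helper: report how raw Pixel Data bytes are laid out."""
    return {
        # Properly encapsulated data begins with the Basic Offset Table item
        "encapsulated": data.startswith(ITEM_TAG_LE),
        # A JP2 wrapper around the codestream begins with the signature box
        "has_jp2_header": data.startswith(JP2_SIGNATURE),
        # Position of the first SOC marker, or -1 if none was found
        "codestream_offset": data.find(J2K_SOC),
    }
```

With a dataset loaded by `dcmread`, this would be called as `describe_pixel_data(ds.PixelData)`. A `codestream_offset` greater than 0 together with `has_jp2_header` being true suggests the header-stripping step is needed.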

To fix... let's see

import matplotlib.pyplot as plt

from pydicom import dcmread
from pydicom.encaps import encapsulate

with dcmread("1320.dcm") as ds:
    data = ds.PixelData

    # Strip out the JP2 header, may not be necessary for GDCM but let's be conformant
    # This will only work if Number of Frames is 1
    offset = data.index(b"\xff\x4f")
    codestream = data[offset:]
    ds.PixelData = encapsulate([codestream])
    ds.save_as("1320_fixed.dcm")

with dcmread("1320_fixed.dcm") as ds:
    arr = ds.pixel_array
    plt.imshow(arr)
    plt.show()

Also I think that's the first time I've actually seen the animal part of DICOM being used.

ravitej177 commented 3 years ago

Thanks @scaramallion, that worked like a charm. Stripping the JP2 header out of the pixel data made the image readable by all the J2K handlers. Also, may I know why this script may not work for multi-frame images? If it cannot handle multiple frames, is there a way I can encapsulate multi-frame images?

Also, I'm not sure about the animal part of DICOM or whether it can be used; I just picked up random DICOM images to test the functionality.

scaramallion commented 3 years ago

At least part of the reason for encapsulating compressed data in DICOM is to provide an easy way to determine where one compressed image codestream ends and the next starts. The problem is that if the JPEG (or other compressed data) codestreams from multiple images are simply concatenated together, there's no guaranteed way to split them back into separate images without parsing the codestream itself.
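
To make the boundary-marking idea concrete, here is a minimal sketch of what an encapsulated Pixel Data value looks like at the byte level. It is not pydicom's implementation (use `pydicom.encaps.encapsulate()` for real data); the function name `encapsulate_sketch` is invented for this example, and it assumes one fragment per frame and an empty Basic Offset Table, which the standard permits.

```python
import struct
from typing import List

ITEM_TAG = b"\xfe\xff\x00\xe0"    # (FFFE,E000) Item, little endian
SEQ_DELIM = b"\xfe\xff\xdd\xe0"   # (FFFE,E0DD) Sequence Delimitation Item


def encapsulate_sketch(frames: List[bytes]) -> bytes:
    """Illustrative only: wrap each frame in its own length-prefixed item."""
    out = bytearray()
    # First item is the Basic Offset Table; left empty (length 0) here
    out += ITEM_TAG + struct.pack("<L", 0)
    for frame in frames:
        # Item values must have an even length, so pad odd frames with a null
        if len(frame) % 2:
            frame += b"\x00"
        # The explicit 4-byte length is what tells a reader where the frame ends
        out += ITEM_TAG + struct.pack("<L", len(frame)) + frame
    # A delimiter item with zero length closes the sequence of fragments
    out += SEQ_DELIM + struct.pack("<L", 0)
    return bytes(out)
```

Because each item carries its own length, a reader can step from frame to frame without looking inside the compressed data at all.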

For JPEG/JPEG-LS/JPEG 2000, there is one approach that will mostly work: look for the end-of-image marker (0xFFD9) and use it to split the data into frames. Then you can just use encapsulate() as usual: ds.PixelData = encapsulate([frame1, frame2, ...]). It may fail if that marker is present in the encoded image data within the codestream (JPEG uses an escape character beforehand so decoders treat it correctly when that happens).
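
The marker-based splitting described above might look roughly like the following. This is a sketch under the same caveat: it treats every occurrence of 0xFFD9 as a frame boundary, so it can split in the wrong place if those two bytes happen to occur inside the encoded data. The function name `split_on_end_marker` is hypothetical, and any bytes after the last marker (such as padding) are dropped.

```python
from typing import List


def split_on_end_marker(data: bytes, marker: bytes = b"\xff\xd9") -> List[bytes]:
    """Split concatenated codestreams after each end-of-image marker.

    Heuristic only: a marker value occurring inside the encoded image data
    would be mistaken for a frame boundary.
    """
    frames = []
    start = 0
    while True:
        pos = data.find(marker, start)
        if pos == -1:
            # No further marker; anything left over is treated as padding
            break
        end = pos + len(marker)
        # Keep the marker as the last two bytes of the frame
        frames.append(data[start:end])
        start = end
    return frames
```

The resulting list can then be passed to encapsulate() as shown above. It is worth comparing `len(frames)` against the dataset's Number of Frames before saving, since a mismatch indicates the heuristic split somewhere it should not have. If each frame also carries a JP2 header, that header would need to be stripped from each frame separately.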