open-dicom / dicom_parser

Facilitates DICOM data access.
https://dicom-parser.readthedocs.io/en/latest/?badge=latest
MIT License
26 stars 15 forks source link

Not-parsed CSA header breaking read of data from functional image #91

Closed matthew-brett closed 2 years ago

matthew-brett commented 2 years ago
curl -L https://github.com/matthew-brett/insular-dicoms/raw/main/DICOM_1/TAP1/FraDam_27062013_016%20-0003-0001-00001.dcm -o func_vol.dcm

Then:

[ins] In [1]: from dicom_parser import Image

[ins] In [2]: img = Image('func_vol.dcm')

[ins] In [3]: data = img.data
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-10cd04492345> in <module>
----> 1 data = img.data

~/Library/Python/3.9/lib/python/site-packages/dicom_parser/image.py in data(self)
    576         """
    577         if self._data is not None:
--> 578             return self.fix_data()
    579 
    580     @property

~/Library/Python/3.9/lib/python/site-packages/dicom_parser/image.py in fix_data(self)
    117         data = self._data
    118         if self.is_mosaic:
--> 119             data = self.mosaic.fold()
    120         if self.is_multi_frame:
    121             data = self.multi_frame.get_data()

~/Library/Python/3.9/lib/python/site-packages/dicom_parser/image.py in mosaic(self)
    704         """
    705         if self.is_mosaic and self._mosaic is None:
--> 706             self._mosaic = Mosaic(self._data, self.header)
    707         return self._mosaic
    708 

~/Library/Python/3.9/lib/python/site-packages/dicom_parser/utils/siemens/mosaic.py in __init__(self, mosaic_array, header)
     50         # Read the ASCII (ASCCONV) header encoded withing the series CSA
     51         # header.
---> 52         self.ascii_header = self.series_header_info.get(
     53             self.CSA_ASCII_HEADER_KEY, {}
     54         ).get("value", {})

AttributeError: 'bytes' object has no attribute 'get'

Investigating a little, it appears that the code is expecting that the contents of self.series_header_info is a parsed CSA structure, but in this case, it is just a sequence of bytes, hence the error.

I'm sorry, I would have investigated more, but it's 00:30 here, and I have a grant report deadline tomorrow!

matthew-brett commented 2 years ago

OK - I did investigate further. It boils down to the fact that the raw private data element in my file is of type OB, but dicom_parser will only parse it if it is of type UN - https://github.com/open-dicom/dicom_parser/blob/main/src/dicom_parser/utils/vr_to_data_element.py#L84

This function shows the problem.

from dicom_parser.utils.read_file import read_file
from dicom_parser.header import Header
from dicom_parser.utils.vr_to_data_element import get_data_element_class

CSA_SERIES_INFO_KEY: str = "CSASeriesHeaderInfo"

def what_is_csa(pth):
    res = read_file(pth)
    hdr = Header(res)
    csa_info = hdr.get(CSA_SERIES_INFO_KEY)
    print('CSA info type', type(csa_info))

    # This boils down to
    tag = hdr.get_private_tag(CSA_SERIES_INFO_KEY)
    raw_element = hdr.get_raw_element(tag)

    print('VR', raw_element.VR)

    el_class = get_data_element_class(raw_element.VR)
    print('Data element class', el_class)

    data_element = el_class(raw_element)
    print('Data element value type', type(data_element.value))

For my file that gives:

CSA info type <class 'bytes'>
VR OB
Data element class <class 'dicom_parser.data_elements.other_byte.OtherByte'>
Data element value type <class 'bytes'>

For tests/files/ep2d_image.dcm it gives:

CSA info type <class 'dict'>
VR UN
Data element class <class 'dicom_parser.data_elements.private_data_element.PrivateDataElement'>
Data element value type <class 'dict'>

This is source of the original error when fetching the data.

So - I think there needs to be some way to recognize that OB is also an acceptable VR for the CSA header ...

matthew-brett commented 2 years ago

I wonder if now is the time to start using Pooch so we can dump func_vol.dcm into the test suite?

When I run this:

[ins] In [36]: from dicom_parser import Image
[ins] In [38]: Image('func_vol.dcm').data

I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [38], in <cell line: 1>()
----> 1 Image('func_vol.dcm').data

File ~/dev_trees/dicom_parser/src/dicom_parser/image.py:578, in Image.data(self)
    568 """
    569 Returns the pixel data array after having applied any required
    570 transformations.
   (...)
    575     Pixel data array
    576 """
    577 if self._data is not None:
--> 578     return self.fix_data()

File ~/dev_trees/dicom_parser/src/dicom_parser/image.py:119, in Image.fix_data(self)
    117 data = self._data
    118 if self.is_mosaic:
--> 119     data = self.mosaic.fold()
    120 if self.is_multi_frame:
    121     data = self.multi_frame.get_data()

File ~/dev_trees/dicom_parser/src/dicom_parser/image.py:706, in Image.mosaic(self)
    697 """
    698 Returns mosaic information.
    699 
   (...)
    703     Mosaic encoded image information
    704 """
    705 if self.is_mosaic and self._mosaic is None:
--> 706     self._mosaic = Mosaic(self._data, self.header)
    707 return self._mosaic

File ~/dev_trees/dicom_parser/src/dicom_parser/utils/siemens/mosaic.py:45, in Mosaic.__init__(self, mosaic_array, header)
     42 self.series_header_info = self.header.get(self.CSA_SERIES_INFO_KEY)
     44 # Number of images encoded in the mosaic.
---> 45 self.n_images = self.get_n_images()
     47 # Number of rows and columns that make up the mosaic.
     48 self.size = int(np.ceil(np.sqrt(self.n_images)))

File ~/dev_trees/dicom_parser/src/dicom_parser/utils/siemens/mosaic.py:72, in Mosaic.get_n_images(self)
     63 """
     64 Returns the number of images encoded in the mosaic.
     65 
   (...)
     69     Number of images encoded in mosaic
     70 """
     71 try:
---> 72     return self.header["NumberOfImagesInMosaic"]
     73 except KeyError:
     74     raise KeyError(messages.MISSING_NUMBER_OF_IMAGES)

File ~/dev_trees/dicom_parser/src/dicom_parser/header.py:132, in Header.__getitem__(self, key)
    118 def __getitem__(self, key: Union[str, tuple, list]) -> Any:
    119     """
    120     Provide dictionary like indexing-operator functionality.
    121 
   (...)
    130         Parsed header information of the given key or keys
    131     """
--> 132     return self.get(key, missing_ok=False)

File ~/dev_trees/dicom_parser/src/dicom_parser/header.py:550, in Header.get(self, tag_or_keyword, default, parsed, missing_ok, as_json)
    548 try:
    549     if isinstance(tag_or_keyword, (str, tuple)):
--> 550         value = get_method(tag_or_keyword)
    551     elif isinstance(tag_or_keyword, list):
    552         value = {
    553             item: self.get(
    554                 item,
   (...)
    560             for item in tag_or_keyword
    561         }

File ~/dev_trees/dicom_parser/src/dicom_parser/header.py:476, in Header.get_parsed_value(self, tag_or_keyword)
    474             return value
    475 else:
--> 476     return data_element.value

File ~/dev_trees/dicom_parser/src/dicom_parser/data_element.py:179, in DataElement.value(self)
    170 """
    171 Caches the parsed value or values of this instance.
    172 
   (...)
    176     This instance's parsed value or values
    177 """
    178 if self._value is None:
--> 179     self._value = self.parse_values()
    180 return self._value

File ~/dev_trees/dicom_parser/src/dicom_parser/data_element.py:134, in DataElement.parse_values(self)
    132 if self.value_multiplicity > 1:
    133     return tuple(self.parse_value(value) for value in self.raw.value)
--> 134 return self.parse_value(self.raw.value)

File ~/dev_trees/dicom_parser/src/dicom_parser/data_elements/private_data_element.py:81, in PrivateDataElement.parse_value(self, value)
     79 method: FunctionType = self.definition.get("method")
     80 if method:
---> 81     return method(self.raw.value)
     83 # Try to decode
     84 elif isinstance(self.raw.value, bytes):

File ~/dev_trees/dicom_parser/src/dicom_parser/utils/siemens/private_tags.py:62, in parse_siemens_number_of_slices_in_mosaic(value)
     61 def parse_siemens_number_of_slices_in_mosaic(value: bytes) -> int:
---> 62     return int.from_bytes(value, byteorder="little")

TypeError: cannot convert 'int' object to bytes

I see that the underlying problem is:

[ins] In [34]: pydicom.read_file('func_vol.dcm')[tag]
Out[34]: (0019, 100a) [NumberOfImagesInMosaic]            US: 28

[ins] In [35]: pydicom.read_file('tests/files/ep2d_image.dcm')[tag]
Out[35]: (0019, 100a) [NumberOfImagesInMosaic]            UN: b'<\x00'

Could parse_siemens_number_of_slices_in_mosaic be adapted to accept an int as well as bytes?

By the way - in this file, all the private tags seem to have standard DICOM types, not UN:

[ins] In [20]: from dicom_parser.data_elements.private_data_element import (
          ...:     TAG_TO_DEFINITION as PRIVATE_TAG_TO_DEFINITION,
          ...: )
raw = pydicom.read_file('func_vol.dcm')
[ins] In [15]: for tag in PRIVATE_TAG_TO_DEFINITION:
          ...:     print(raw.get(tag))
          ...: 
(0019, 100a) [NumberOfImagesInMosaic]            US: 28
(0019, 100b) [SliceMeasurementDuration]          DS: "22.5"
None
None
None
(0019, 1028) [BandwidthPerPixelPhaseEncode]      FD: 43.403
(0019, 1029) [MosaicRefAcqTimes]                 FD: Array of 28 elements
(0029, 1010) [CSA Image Header Info]             OB: Array of 12336 elements
(0029, 1020) [CSA Series Header Info]            OB: Array of 137968 elements
matthew-brett commented 2 years ago

What about something like this?

diff --git a/src/dicom_parser/utils/siemens/private_tags.py b/src/dicom_parser/utils/siemens/private_tags.py
index 14eca1b..3db3fa0 100644
--- a/src/dicom_parser/utils/siemens/private_tags.py
+++ b/src/dicom_parser/utils/siemens/private_tags.py
@@ -58,8 +58,9 @@ def parse_siemens_gradient_direction(value: bytes) -> list:
     return [float(value) for value in list(array.array("d", value))]

-def parse_siemens_number_of_slices_in_mosaic(value: bytes) -> int:
-    return int.from_bytes(value, byteorder="little")
+def parse_siemens_number_of_slices_in_mosaic(value: [int, bytes]) -> int:
+    return (value if isinstance(value, int) else
+            int.from_bytes(value, byteorder="little"))

 def parse_siemens_b_matrix(value: bytes) -> np.ndarray:
@@ -185,8 +186,10 @@ def nearest_pos_semi_def(b_matrix: np.ndarray):
     return np.dot(vecs, np.dot(np.diag(scalers), vecs.T))

-def parse_siemens_bandwith_per_pixel_phase_encode(value: bytes):
-    return array.array("d", value)[0]
+def parse_siemens_bandwith_per_pixel_phase_encode(
+    value: [float, bytes]):
+    return (value if isinstance(value, float) else
+            array.array("d", value)[0])
ZviBaratz commented 2 years ago

I think I might have something a little more general in mind. I'll give it a go and update. Pooch looks perfect.