Closed matthew-brett closed 2 years ago
OK - I did investigate further. It boils down to the fact that the raw private data element in my file is of type OB
, but dicom_parser
will only parse it if it is of type UN
- https://github.com/open-dicom/dicom_parser/blob/main/src/dicom_parser/utils/vr_to_data_element.py#L84
This function shows the problem.
from dicom_parser.utils.read_file import read_file
from dicom_parser.header import Header
from dicom_parser.utils.vr_to_data_element import get_data_element_class
CSA_SERIES_INFO_KEY: str = "CSASeriesHeaderInfo"
def what_is_csa(pth):
res = read_file(pth)
hdr = Header(res)
csa_info = hdr.get(CSA_SERIES_INFO_KEY)
print('CSA info type', type(csa_info))
# This boils down to
tag = hdr.get_private_tag(CSA_SERIES_INFO_KEY)
raw_element = hdr.get_raw_element(tag)
print('VR', raw_element.VR)
el_class = get_data_element_class(raw_element.VR)
print('Data element class', el_class)
data_element = el_class(raw_element)
print('Data element value type', type(data_element.value))
For my file that gives:
CSA info type <class 'bytes'>
VR OB
Data element class <class 'dicom_parser.data_elements.other_byte.OtherByte'>
Data element value type <class 'bytes'>
For tests/files/ep2d_image.dcm
it gives:
CSA info type <class 'dict'>
VR UN
Data element class <class 'dicom_parser.data_elements.private_data_element.PrivateDataElement'>
Data element value type <class 'dict'>
This is source of the original error when fetching the data.
So - I think there needs to be some way to recognize that OB
is also an acceptable VR for the CSA header ...
I wonder if now is the time to start using Pooch so we can dump func_vol.dcm
into the test suite?
When I run this:
[ins] In [36]: from dicom_parser import Image
[ins] In [38]: Image('func_vol.dcm').data
I get:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [38], in <cell line: 1>()
----> 1 Image('func_vol.dcm').data
File ~/dev_trees/dicom_parser/src/dicom_parser/image.py:578, in Image.data(self)
568 """
569 Returns the pixel data array after having applied any required
570 transformations.
(...)
575 Pixel data array
576 """
577 if self._data is not None:
--> 578 return self.fix_data()
File ~/dev_trees/dicom_parser/src/dicom_parser/image.py:119, in Image.fix_data(self)
117 data = self._data
118 if self.is_mosaic:
--> 119 data = self.mosaic.fold()
120 if self.is_multi_frame:
121 data = self.multi_frame.get_data()
File ~/dev_trees/dicom_parser/src/dicom_parser/image.py:706, in Image.mosaic(self)
697 """
698 Returns mosaic information.
699
(...)
703 Mosaic encoded image information
704 """
705 if self.is_mosaic and self._mosaic is None:
--> 706 self._mosaic = Mosaic(self._data, self.header)
707 return self._mosaic
File ~/dev_trees/dicom_parser/src/dicom_parser/utils/siemens/mosaic.py:45, in Mosaic.__init__(self, mosaic_array, header)
42 self.series_header_info = self.header.get(self.CSA_SERIES_INFO_KEY)
44 # Number of images encoded in the mosaic.
---> 45 self.n_images = self.get_n_images()
47 # Number of rows and columns that make up the mosaic.
48 self.size = int(np.ceil(np.sqrt(self.n_images)))
File ~/dev_trees/dicom_parser/src/dicom_parser/utils/siemens/mosaic.py:72, in Mosaic.get_n_images(self)
63 """
64 Returns the number of images encoded in the mosaic.
65
(...)
69 Number of images encoded in mosaic
70 """
71 try:
---> 72 return self.header["NumberOfImagesInMosaic"]
73 except KeyError:
74 raise KeyError(messages.MISSING_NUMBER_OF_IMAGES)
File ~/dev_trees/dicom_parser/src/dicom_parser/header.py:132, in Header.__getitem__(self, key)
118 def __getitem__(self, key: Union[str, tuple, list]) -> Any:
119 """
120 Provide dictionary like indexing-operator functionality.
121
(...)
130 Parsed header information of the given key or keys
131 """
--> 132 return self.get(key, missing_ok=False)
File ~/dev_trees/dicom_parser/src/dicom_parser/header.py:550, in Header.get(self, tag_or_keyword, default, parsed, missing_ok, as_json)
548 try:
549 if isinstance(tag_or_keyword, (str, tuple)):
--> 550 value = get_method(tag_or_keyword)
551 elif isinstance(tag_or_keyword, list):
552 value = {
553 item: self.get(
554 item,
(...)
560 for item in tag_or_keyword
561 }
File ~/dev_trees/dicom_parser/src/dicom_parser/header.py:476, in Header.get_parsed_value(self, tag_or_keyword)
474 return value
475 else:
--> 476 return data_element.value
File ~/dev_trees/dicom_parser/src/dicom_parser/data_element.py:179, in DataElement.value(self)
170 """
171 Caches the parsed value or values of this instance.
172
(...)
176 This instance's parsed value or values
177 """
178 if self._value is None:
--> 179 self._value = self.parse_values()
180 return self._value
File ~/dev_trees/dicom_parser/src/dicom_parser/data_element.py:134, in DataElement.parse_values(self)
132 if self.value_multiplicity > 1:
133 return tuple(self.parse_value(value) for value in self.raw.value)
--> 134 return self.parse_value(self.raw.value)
File ~/dev_trees/dicom_parser/src/dicom_parser/data_elements/private_data_element.py:81, in PrivateDataElement.parse_value(self, value)
79 method: FunctionType = self.definition.get("method")
80 if method:
---> 81 return method(self.raw.value)
83 # Try to decode
84 elif isinstance(self.raw.value, bytes):
File ~/dev_trees/dicom_parser/src/dicom_parser/utils/siemens/private_tags.py:62, in parse_siemens_number_of_slices_in_mosaic(value)
61 def parse_siemens_number_of_slices_in_mosaic(value: bytes) -> int:
---> 62 return int.from_bytes(value, byteorder="little")
TypeError: cannot convert 'int' object to bytes
I see that the underlying problem is:
[ins] In [34]: pydicom.read_file('func_vol.dcm')[tag]
Out[34]: (0019, 100a) [NumberOfImagesInMosaic] US: 28
[ins] In [35]: pydicom.read_file('tests/files/ep2d_image.dcm')[tag]
Out[35]: (0019, 100a) [NumberOfImagesInMosaic] UN: b'<\x00'
Could parse_siemens_number_of_slices_in_mosaic
be adapted to accept an int
as well as bytes
?
By the way - in this file, all the private tags seem to have standard DICOM types, not UN:
[ins] In [20]: from dicom_parser.data_elements.private_data_element import (
...: TAG_TO_DEFINITION as PRIVATE_TAG_TO_DEFINITION,
...: )
raw = pydicom.read_file('func_vol.dcm')
[ins] In [15]: for tag in PRIVATE_TAG_TO_DEFINITION:
...: print(raw.get(tag))
...:
(0019, 100a) [NumberOfImagesInMosaic] US: 28
(0019, 100b) [SliceMeasurementDuration] DS: "22.5"
None
None
None
(0019, 1028) [BandwidthPerPixelPhaseEncode] FD: 43.403
(0019, 1029) [MosaicRefAcqTimes] FD: Array of 28 elements
(0029, 1010) [CSA Image Header Info] OB: Array of 12336 elements
(0029, 1020) [CSA Series Header Info] OB: Array of 137968 elements
What about something like this?
diff --git a/src/dicom_parser/utils/siemens/private_tags.py b/src/dicom_parser/utils/siemens/private_tags.py
index 14eca1b..3db3fa0 100644
--- a/src/dicom_parser/utils/siemens/private_tags.py
+++ b/src/dicom_parser/utils/siemens/private_tags.py
@@ -58,8 +58,9 @@ def parse_siemens_gradient_direction(value: bytes) -> list:
return [float(value) for value in list(array.array("d", value))]
-def parse_siemens_number_of_slices_in_mosaic(value: bytes) -> int:
- return int.from_bytes(value, byteorder="little")
+def parse_siemens_number_of_slices_in_mosaic(value: [int, bytes]) -> int:
+ return (value if isinstance(value, int) else
+ int.from_bytes(value, byteorder="little"))
def parse_siemens_b_matrix(value: bytes) -> np.ndarray:
@@ -185,8 +186,10 @@ def nearest_pos_semi_def(b_matrix: np.ndarray):
return np.dot(vecs, np.dot(np.diag(scalers), vecs.T))
-def parse_siemens_bandwith_per_pixel_phase_encode(value: bytes):
- return array.array("d", value)[0]
+def parse_siemens_bandwith_per_pixel_phase_encode(
+ value: [float, bytes]):
+ return (value if isinstance(value, float) else
+ array.array("d", value)[0])
I think I might have something a little more general in mind. I'll give it a go and update. Pooch looks perfect.
Then:
Investigating a little, it appears that the code is expecting that the contents of
self.series_header_info
is a parsed CSA structure, but in this case, it is just a sequence of bytes, hence the error.I'm sorry, I would have investigated more, but it's 00:30 here, and I have a grant report deadline tomorrow!