suever / pydicom-experimental

pydicom test

read_file() returns incomplete dataset for DICOM file with nested private sequences #113

Closed suever closed 9 years ago

suever commented 9 years ago

From d.j.hun...@gmail.com on February 29, 2012 13:49:58

I have a DICOM file that contains a couple of private sequences of undefined length, which themselves contain undefined-length sequences nested within them. The transfer syntax is implicit VR. When I attempt to read this file, many of the data elements (including the pixel data) are missing from the returned dataset.

Looking at the code, the problem appears to originate when data_element_generator() reaches a private sequence whose VR is unknown. Under this condition, the sequence is treated as binary data of undefined length and read_undefined_length_value() is called, which parses the file until a sequence delimiter tag is reached. However, in the case of nested sequences, the next sequence delimiter to be reached corresponds to the end of the first nested sequence rather than that of the parent sequence.

As such, the parent sequence is only partially read, and the rest of the sequence is parsed as if it is the top-level dataset. When the parent’s actual sequence delimiter is reached it is detected by read_dataset(), ultimately causing read_file() to terminate early.
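The failure mode described above can be reproduced on a small hand-built byte stream. This is a minimal sketch that assumes implicit VR little endian encoding; it does not call pydicom, and the private tags (0029,1001) to (0029,1003) and the helper names are hypothetical, chosen only for illustration. It shows that a plain search for the first Sequence Delimitation Item stops at the nested sequence's delimiter rather than the parent's.

```python
import struct

# Delimiter tags in implicit VR little endian byte order.
ITEM = struct.pack("<HH", 0xFFFE, 0xE000)        # Item
ITEM_DELIM = struct.pack("<HH", 0xFFFE, 0xE00D)  # Item Delimitation Item
SEQ_DELIM = struct.pack("<HH", 0xFFFE, 0xE0DD)   # Sequence Delimitation Item
UNDEFINED = struct.pack("<L", 0xFFFFFFFF)        # undefined length marker
ZERO = struct.pack("<L", 0)


def element(group, elem, value):
    """Encode a fixed-length element (hypothetical helper)."""
    return struct.pack("<HHL", group, elem, len(value)) + value


def undefined_sequence(group, elem, items):
    """Encode an undefined-length sequence of undefined-length items."""
    body = b"".join(ITEM + UNDEFINED + item + ITEM_DELIM + ZERO for item in items)
    return struct.pack("<HH", group, elem) + UNDEFINED + body + SEQ_DELIM + ZERO


# A private sequence nested inside another private sequence.
inner = undefined_sequence(0x0029, 0x1002, [element(0x0029, 0x1003, b"AB")])
outer = undefined_sequence(0x0029, 0x1001, [inner])

value = outer[8:]  # skip the outer tag and length fields

# Treating the value as opaque bytes: stop at the first delimiter found.
naive_end = value.find(SEQ_DELIM)
# The delimiter that actually closes the parent is the last one in this stream.
true_end = value.rfind(SEQ_DELIM)

print(naive_end, true_end)
```

Everything between `naive_end` and `true_end` is what would then be misread as top-level data.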

As a workaround, I’ve modified data_element_generator() to check all data elements of undefined length to see if they are sequences (based on the assumption that the first four bytes of an SQ data element value will always be an Item Tag):

--- a/src/oxmorf/dicom/filereader.py
+++ b/src/oxmorf/dicom/filereader.py
@@ -247,7 +247,13 @@
     VR = dictionaryVR(tag)
 except KeyError:
     pass

The hope is that this should prevent any sequences being read as binary data (it seems to work ok so far, although I've not properly tested it).
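The check described in this workaround could look roughly like the sketch below. It is not the code from the patch or from pydicom: the function name `next_tag_is_item` and its parameters are assumptions made for illustration. It peeks at the next four bytes of a file-like object, restores the read position, and reports whether they encode the Item tag (FFFE,E000).

```python
import io
import struct

ITEM_TAG = (0xFFFE, 0xE000)


def next_tag_is_item(fp, is_little_endian=True):
    """Peek at the next four bytes and report whether they are an Item tag.

    Hypothetical helper: the stream position is restored before returning,
    so the caller can go on to read the value in whichever way it chooses.
    """
    fmt = "<HH" if is_little_endian else ">HH"
    pos = fp.tell()
    raw = fp.read(4)
    fp.seek(pos)
    if len(raw) < 4:
        return False  # too few bytes left to hold a tag
    return struct.unpack(fmt, raw) == ITEM_TAG


# Usage: a value that starts with an Item tag, and one that does not.
sequence_like = io.BytesIO(b"\xfe\xff\x00\xe0\xff\xff\xff\xff")
binary_like = io.BytesIO(b"\x01\x02\x03\x04\x05\x06\x07\x08")
print(next_tag_is_item(sequence_like), next_tag_is_item(binary_like))
```

One limitation of this heuristic: an undefined-length sequence with no items begins with a Sequence Delimitation Item rather than an Item tag, so the check would not flag it as a sequence.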

If needed, I should shortly be able to provide the DICOM file in question.

Many thanks,

David

Original issue: http://code.google.com/p/pydicom/issues/detail?id=113

suever commented 9 years ago

From darcymason@gmail.com on February 29, 2012 19:31:04

Thanks for the detailed report, and yes, a file would be helpful (as always, with no confidential information of any kind).

I think I will leave this until after the 0.9.7 release (after which pydicom will move towards Python 3), and backport the solution to the Python 2.x branch, with thorough testing in place.

Status: Accepted
Labels: Milestone-Release1.0

suever commented 9 years ago

From d.j.hun...@gmail.com on March 05, 2012 06:18:46

Here's the offending file.

David

Attachment: nestedPrivateSQ.dcm

suever commented 9 years ago

From Suever@gmail.com on June 13, 2012 12:13:07

This issue was closed by revision 84af4b240add.

Status: Fixed

suever commented 9 years ago

From Suever@gmail.com on June 13, 2012 12:15:31

David,

I was able to come up with a patch based on your suggestion. The one modification I made was to set the VR to 'SQ' if an item tag was indeed found. This allows proper parsing of the file as well as proper formatting while printing the sequences.
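To illustrate why treating such an element as 'SQ' helps, here is a minimal sketch of a nesting-aware walk over an implicit VR little endian byte stream. It is not the code from revision 84af4b240add: the function names are hypothetical, and it assumes that every undefined-length element inside an item is a sequence. Because each nested sequence consumes its own delimiter, the walk ends at the delimiter that really closes the parent.

```python
import struct

ITEM = (0xFFFE, 0xE000)
ITEM_DELIM = (0xFFFE, 0xE00D)
SEQ_DELIM = (0xFFFE, 0xE0DD)
UNDEFINED = 0xFFFFFFFF


def read_header(buf, pos):
    """Read an 8-byte implicit VR header; return (tag, length, next offset)."""
    group, elem, length = struct.unpack_from("<HHL", buf, pos)
    return (group, elem), length, pos + 8


def skip_sequence(buf, pos):
    """Return the offset just past the delimiter closing the sequence at pos."""
    while True:
        tag, length, pos = read_header(buf, pos)
        if tag == SEQ_DELIM:
            return pos
        if tag != ITEM:
            raise ValueError("expected an Item tag inside a sequence")
        pos = skip_item(buf, pos) if length == UNDEFINED else pos + length


def skip_item(buf, pos):
    """Return the offset just past the delimiter closing the item at pos."""
    while True:
        tag, length, pos = read_header(buf, pos)
        if tag == ITEM_DELIM:
            return pos
        if length == UNDEFINED:
            # Assumption: an undefined-length element here is a sequence.
            pos = skip_sequence(buf, pos)
        else:
            pos += length


# Value of a parent sequence holding one nested sequence, followed by a
# top-level (7FE0,0010) element. The private tags are hypothetical.
buf = bytes.fromhex(
    "feff00e0ffffffff"      # parent item, undefined length
    "29000210ffffffff"      # nested private sequence (0029,1002)
    "feff00e0ffffffff"      # nested item, undefined length
    "29000310020000004142"  # (0029,1003), length 2, value b"AB"
    "feff0de000000000"      # nested item delimiter
    "feffdde000000000"      # nested sequence delimiter
    "feff0de000000000"      # parent item delimiter
    "feffdde000000000"      # parent sequence delimiter
    "e07f1000020000000000"  # (7FE0,0010), length 2
)

end = skip_sequence(buf, 0)
next_tag, _, _ = read_header(buf, end)
print(end, next_tag)
```

In this sketch the element that follows the parent sequence is reached intact instead of being lost to an early stop.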

Additionally, I created a very simple example file as well as a unittest.

-Suever