carlosqueiroz / pydicom

Automatically exported from code.google.com/p/pydicom

Need option for trusting (0002, 0000) Group Length or better heuristic than not_group2 #139

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I have some DICOM files that pydicom 0.9.8 cannot read because of the 
heuristic it introduced for handling improper group lengths. The header of the 
file looks like this:

# Dicom-File-Format

# Dicom-Meta-Information-Header
# Used TransferSyntax: LittleEndianExplicit
(0002,0000) UL 202                                      #   4, 1 MetaElementGroupLength
(0002,0001) OB 00\01                                    #   2, 1 FileMetaInformationVersion
(0002,0002) UI =f #  30, 1 MediaStorageSOPClassUID
(0002,0003) UI [1.2.826.0.1.3680043.8.760.7.1553278726.1392203754637.31] #  56, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =LittleEndianImplicit                    #  18, 1 TransferSyntaxUID
(0002,0012) UI [1.2.276.0.7230010.3.0.3.6.1]            #  28, 1 ImplementationClassUID
(0002,0013) SH [OFFIS_DCMTK_361]                        #  16, 1 ImplementationVersionName

# Dicom-Data-Set
# Used TransferSyntax: LittleEndianImplicit
(0002,0003) UI [1.2.826.XXXX] #  56, 1 MediaStorageSOPInstanceUID
(0008,0005) CS [ISO_IR 100]                             #  10, 1 SpecificCharacterSet
(0008,0008) CS [ORIGINAL\PRIMARY\TOMO_PROJ\RIGHT]       #  32, 4 ImageType
(0008,0016) UI =DigitalMammographyXRayImageStorageForProcessing #  30, 1 SOPClassUID

Using pydicom out of the box, reading the file returns a single tag:
(2e31, 2e32) Private tag data                    OB: Array of 20188336 bytes

pydicom interprets the start of the MediaStorageSOPInstanceUID value as a 
(group, element) identifier ("1.2." -> (2e31, 2e32)). When debugging, I also 
see this log message:

      logger.info("*** Group length for file meta dataset "
                            "did not match end of group 2 data ***")

In filereader (line 463), fp_now is 354 but expected_ds_start is 346. If I 
manually reposition the file pointer with fp.seek(346), the file is read 
correctly; otherwise I get the erroneous result above.
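
The value 346 is what the header arithmetic predicts when the group length is trusted. A minimal sketch, assuming the standard 128-byte preamble plus the 4-byte "DICM" prefix and explicit VR little endian element headers (8 bytes for short VRs such as UL, UI and SH, 12 bytes for OB). The value lengths are copied from the dump above; all names are made up for illustration:

```python
# Byte layout of the file meta group, using the value lengths from the dump.
PREAMBLE = 128 + 4  # 128-byte preamble followed by b"DICM"

SHORT_HEADER = 8   # tag (4) + VR (2) + 2-byte length: UL, UI, SH, ...
LONG_HEADER = 12   # tag (4) + VR (2) + reserved (2) + 4-byte length: OB, ...
LONG_VRS = ("OB", "OW", "OF", "SQ", "UN", "UT")

# (tag, VR, value length in bytes) for every element after (0002,0000)
meta_elements = [
    ("(0002,0001)", "OB", 2),
    ("(0002,0002)", "UI", 30),
    ("(0002,0003)", "UI", 56),
    ("(0002,0010)", "UI", 18),
    ("(0002,0012)", "UI", 28),
    ("(0002,0013)", "SH", 16),
]


def element_size(vr, value_length):
    """Total encoded size of one explicit VR element (header + value)."""
    header = LONG_HEADER if vr in LONG_VRS else SHORT_HEADER
    return header + value_length


group_length = sum(element_size(vr, n) for _, vr, n in meta_elements)
group_length_element = SHORT_HEADER + 4  # (0002,0000) UL with a 4-byte value
expected_ds_start = PREAMBLE + group_length_element + group_length

print(group_length, expected_ds_start)  # 202 346
```

Under these assumptions the stated group length of 202 matches the elements in the dump, so the group length in these files appears to be correct and the data set starts at byte 346.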

If I understand this correctly, the issue arises because the DICOM meta 
information is in explicit VR, whereas the actual data set here is in implicit 
VR. Since the producer considers (0002, 0003) part of the data set, it is 
written in implicit VR. Because of the not_group2 heuristic, however, pydicom 
treats it as part of the DICOM meta header and reads it as explicit VR.
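
This reading is consistent with the offsets reported above. A minimal sketch using only the standard library (it is not pydicom's actual parser): it encodes (0002, 0003) the way an implicit VR little endian writer would, then decodes the same bytes with the short explicit VR layout. The 56-byte value reuses the UID from the file meta header, padded with a null byte; that the data set copy holds the same value is an assumption, since the dump only shows "1.2.826.XXXX".

```python
import struct

uid = b"1.2.826.0.1.3680043.8.760.7.1553278726.1392203754637.31"
value = uid + b"\x00" * (len(uid) % 2)  # pad to an even length (56 bytes)

# Implicit VR little endian: group, element, 4-byte length, value.
implicit = struct.pack("<HHL", 0x0002, 0x0003, len(value)) + value

# Explicit VR little endian (short form): group, element, 2-byte VR, 2-byte length.
# The 4-byte length 56 is misread as VR b"8\x00" followed by a length of 0.
group, elem, vr, length = struct.unpack("<HH2sH", implicit[:8])
print(hex(group), hex(elem), vr, length)

# With a length of 0, an explicit VR reader resumes 8 bytes later
# (346 + 8 = 354), which is the first byte of the UID value "1.2.".
next_group, next_elem = struct.unpack("<HH", implicit[8:12])
print(hex(next_group), hex(next_elem))  # 0x2e31 0x2e32
```

Read this way, the element still looks like a group 2 tag, so a heuristic that stops at the first non-group-2 tag keeps going and lands on the UID text.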

Original issue reported on code.google.com by maddin@gmail.com on 20 Feb 2014 at 2:47

GoogleCodeExporter commented 9 years ago
I created a small patch that solves it for me with a new heuristic.

Original comment by maddin@gmail.com on 20 Feb 2014 at 3:31

GoogleCodeExporter commented 9 years ago
Thanks for the detailed investigation and the patch. I'm fairly sure these 
files are not compliant with the DICOM standard (for one thing, tags must be in 
numeric order, though perhaps that applies to the file meta and the main 
dataset separately). I'm inclined to solve this by offering an option to trust 
the group length, as mentioned in your issue subject line. I'll give it some 
more thought and try to put something in place for an upcoming version.

Out of curiosity, the second MediaStorageSOPInstanceUID you show has 
'1.2.826.XXXX'. Is that really what the value is or is that some kind of 
shorthand used in the display? If the former, then it would look like the 
creating program didn't clean up a placeholder of some kind.

Original comment by darcymason@gmail.com on 23 Feb 2014 at 9:43