suever / pydicom-experimental

pydicom test

DataElement_from_raw exception in 0.9.8 #122

Open suever opened 9 years ago

suever commented 9 years ago

From barlen...@gmail.com on February 05, 2013 10:59:59

Hi,

I have written a module to hold the tags from a set of dicom files which worked with pydicom 0.9.7. (Thanks!) With 0.9.8,

value = dicom.dataelem.DataElement_from_raw(value)

throws (with debugging info appended):

TypeError: "'NoneType' object is unsubscriptable"

/usr/lib/python2.6/site-packages/dicom/values.py(191)convert_value()
    190             # Text VRs use the 2nd specified encoding
--> 191             value = converter(byte_string, is_little_endian, encoding=encoding[1])
    192     elif VR != "SQ":

ipdb> byte_string
'head\ARIC_NCS_HUMAN_3T_SKYRA_VD11_D_MRQ\32 coil Human'
ipdb> is_little_endian
True
ipdb> encoding
ipdb> default_encoding
'iso8859'
ipdb> encoding
ipdb> u

/usr/lib/python2.6/site-packages/dicom/dataelem.py(338)DataElement_from_raw()
    337     try:
2-> 338         value = convert_value(VR, raw, encoding)
    339     except NotImplementedError as e:

ipdb> encoding
ipdb> l
    333             elif raw.tag.element == 0:  # group length tag implied in versions < 3.0
    334                 VR = 'UL'
    335             else:
    336                 raise KeyError("Unknown DICOM tag {0:s} - can't look up VR".format(str(raw.tag)))
    337     try:
2-> 338         value = convert_value(VR, raw, encoding)
    339     except NotImplementedError as e:
    340         raise NotImplementedError("{0:s} in tag {1!r}".format(str(e), raw.tag))
    341     return DataElement(raw.tag, VR, value, raw.value_tell,
    342                        raw.length == 0xFFFFFFFF, already_converted=True)

What happened is that I loaded a DICOM file including the tag

(0010,4000) LT [head\ARIC_NCS_HUMAN_3T_SKYRA_VD11_D_MRQ\32 coil Human</... # 76, 1 PatientComments

dataelem.DataElement_from_raw's signature is DataElement_from_raw(raw_data_element, encoding=None), and it calls values.convert_value as convert_value(VR, raw, encoding). The default encoding for values.convert_value is not None, however, so convert_value cannot handle the None that gets passed through, and it fails when it evaluates encoding[1].
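To make the failure mode concrete, here is a minimal sketch that does not import pydicom at all. The function name pick_text_encoding is hypothetical; it only mirrors the encoding[1] expression visible in the traceback. The two-item tuple in the usage note is likewise an assumed shape for the encoding argument, chosen purely for illustration.

```python
def pick_text_encoding(encoding):
    """Hypothetical stand-in for the `encoding[1]` lookup seen in the traceback."""
    # Indexing works for any sequence, but None cannot be subscripted.
    return encoding[1]


def describe_failure(encoding):
    """Return the name of the exception raised for this encoding, or 'ok'."""
    try:
        pick_text_encoding(encoding)
    except TypeError as exc:
        return type(exc).__name__
    return "ok"
```

Calling describe_failure(None) reports a TypeError, which matches the exception type in the report, while describe_failure(("iso8859", "iso8859")) does not raise.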

I worked around this in my code by changing the DataElement_from_raw call to

from dicom.charset import default_encoding
value = dicom.dataelem.DataElement_from_raw(value, default_encoding)

(not being enough of an encoding expert to know what else to use).

Thanks,

Rob Reid

Original issue: http://code.google.com/p/pydicom/issues/detail?id=122

suever commented 9 years ago

From Suever@gmail.com on February 05, 2013 08:41:38

Rob,

This was introduced in 0.9.8 as we strive toward Python 3 compatibility. The biggest issue in Python 3 is the conversion between raw bytes and string literals. We actually perform this conversion inside of DataElement_from_raw.

Internally when we use DataElement_from_raw, the second argument is the specific character set for the containing Dataset (Dataset._character_set). In many cases, the default_encoding should work (iso8859) but it will definitely fail in some.

All that being said, our goal with 0.9.8 was to add Python 3 compatibility without drastically altering the expected behavior in Python 2. In Python 2, we don't actually convert from string literals to unicode in DataElement_from_raw, so the encoding parameter is really unnecessary there. However, that also means that the default value for the encoding parameter should not result in an exception. I'll have to think about what the default value should be (because we still want to force Python 3 users to enter a value), but in the meantime, as a workaround, you can pass either ds._character_set or dicom.charset.default_encoding as the second parameter.
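One way to picture the workaround is a small helper that decides what to pass as the second parameter. This is only a sketch under assumptions: resolve_encoding and DEFAULT_ENCODING are hypothetical names, not part of pydicom, and the 'iso8859' value is simply the default_encoding shown in the debugger output above.

```python
# Assumed to match dicom.charset.default_encoding as printed in the ipdb session.
DEFAULT_ENCODING = "iso8859"


def resolve_encoding(encoding=None, dataset_character_set=None):
    """Hypothetical helper: choose a non-None encoding for DataElement_from_raw."""
    if encoding is not None:
        # An explicit caller-supplied encoding always wins.
        return encoding
    if dataset_character_set is not None:
        # Otherwise prefer the containing dataset's character set.
        return dataset_character_set
    # Last resort: the library-wide default, which may be wrong for some files.
    return DEFAULT_ENCODING
```

The result would then be handed to DataElement_from_raw as its second argument, so None never reaches convert_value. Preferring the dataset's character set over the global default reflects the caveat above that the default will fail in some cases.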

-Jonathan