Inability to write Dataset to file after decoding

GoogleCodeExporter commented 9 years ago

If the user decodes the data elements in a dataset, string and Person Name 
elements are converted to unicode. When the user then attempts to write this 
dataset to file, the unicode values are never explicitly encoded, so Python 
implicitly encodes them as ASCII (or another encoding, depending on the user's 
environment settings). The following code demonstrates this issue:

# -*- coding: utf-8 -*-
import dicom
from dicom.dataelem import DataElement
from dicom.charset import decode
from dicom.filewriter import write_string

data_element = DataElement((0x08,0x70),'SH','Suéver')
decode(data_element, None)  # Decodes using default_encoding
# Writing the decoded (unicode) value triggers the implicit ASCII encode
with open('/dev/null','wb') as fp:
    write_string(fp, data_element)

NOTE: The é character is not valid in ASCII, so write_string raises an 
ambiguous UnicodeEncodeError.
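
The failure in the note above can be reproduced without pydicom. The following is a minimal sketch, not code from the library: implicit_ascii_encode is a hypothetical helper that spells out the ASCII conversion Python 2 performs implicitly when unicode text reaches a byte-oriented write, and iso8859_1 is only an assumed stand-in for whatever encoding the dataset actually uses.

```python
# -*- coding: utf-8 -*-
# Decoded (unicode) form of the value from the example; \xe9 is é.
value = u'Su\xe9ver'


def implicit_ascii_encode(text):
    """Hypothetical helper: the ASCII encode Python 2 applies implicitly."""
    return text.encode('ascii')


try:
    implicit_ascii_encode(value)
    failed = False
    message = ''
except UnicodeEncodeError as exc:
    # é has no ASCII representation, so the encode step fails here.
    failed = True
    message = str(exc)

# Encoding explicitly with a suitable codec (assumed here) succeeds.
explicit = value.encode('iso8859_1')
```

The error message names only the codec and the offending character, not the data element or the write that triggered it, which is why the error is hard to trace back to the dataset.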

Explicit handling of unicode needs to be added not only to 
filewriter.write_string but also to filewriter.write_PN. PersonName elements 
need to be handled differently because a single PN value can use multiple 
encodings.
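
One way to picture the two cases is the rough sketch below. It is not the patch itself and not pydicom's API: encode_string and encode_person_name are hypothetical names, the codec names are assumptions, and the PN handling assumes that the component groups of a name are separated by '=' and that each group may have its own encoding.

```python
# -*- coding: utf-8 -*-
def encode_string(value, encoding='iso8859_1'):
    """Hypothetical helper: explicitly encode a text value before writing."""
    if isinstance(value, bytes):
        return value  # already encoded, nothing to do
    return value.encode(encoding)


def encode_person_name(value, encodings):
    """Hypothetical helper: encode each '='-separated group separately."""
    encoded = []
    for index, group in enumerate(value.split('=')):
        # Reuse the last encoding if fewer encodings than groups are given.
        encoding = encodings[min(index, len(encodings) - 1)]
        encoded.append(group.encode(encoding))
    return b'='.join(encoded)


# Usage: a single-encoding string and a two-group person name.
short_string = encode_string(u'Su\xe9ver')
person_name = encode_person_name(
    u'Yamada^Tarou=\u5c71\u7530^\u592a\u90ce',
    ['iso8859_1', 'iso2022_jp'],
)
```

Encoding the groups one at a time is what distinguishes the PN case from the plain string case, where a single encoding covers the whole value.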

I have already begun work on a patch for this, but just wanted to document the 
issue here in case it crops up in the future.

Original issue reported on code.google.com by Suever@gmail.com on 24 Nov 2012 at 4:28

GoogleCodeExporter commented 9 years ago

Original comment by Suever@gmail.com on 24 Nov 2012 at 4:40

Changed state: Started
Added labels: Difficulty-Easy

GoogleCodeExporter commented 9 years ago

Original comment by Suever@gmail.com on 25 Nov 2012 at 12:30

Changed state: Fixed

spidersaint / pydicom

Inability to write Dataset to file after decoding #118