google-code-export / pydicom

Automatically exported from code.google.com/p/pydicom

Conversion of raw unicode values in Dataset initialization fails (IronPython) #115

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Use the following Python code and the attached dataset to reproduce

        import dicom  # pydicom 0.9.x is imported as `dicom`

        f1 = 'unicodeST.dcm'
        ds1 = dicom.read_file(f1, stop_before_pixels=True)
        string = str(ds1)

An exception occurs when mapping raw data to dict, leaving the Dataset in an 
incomplete and invalid state.

What version of the product are you using?
Pydicom 0.9.7
IronPython 2.7

Please provide any additional information below.

The issue results from raw unicode data being typed as 'str'. The conversion 
to str then fails with an unhandled exception.

I don't have a patch tool. Below is a modified valuerep.py:MultiString 
implementation that resolves the issue.

def MultiString(val, valtype=str):
    """Split a string by delimiters if there are any

    val -- DICOM string to split up
    valtype -- default str, but can be e.g. UID to overwrite to a specific type
    """
    # Remove trailing blank used to pad to even length
    #2005.05.25: also check for trailing 0, error made in PET files we are converting
    if val and (val.endswith(' ') or val.endswith('\x00')):
        val = val[:-1]

    # XXX --> simpler version python > 2.4   splitup = [valtype(x) if x else x for x in val.split("\\")]
    splitup = []
    for subval in val.split("\\"):
        if subval:
            if isinstance(val, unicode):
                splitup.append(unicode(subval))
            else:
                splitup.append(valtype(subval))
        else:
            splitup.append(subval)
    if len(splitup) == 1:
        return splitup[0]
    else:
        return MultiValue(valtype, splitup)

Original issue reported on code.google.com by j...@computer.org on 23 Apr 2012 at 4:02

GoogleCodeExporter commented 9 years ago
Thanks for the issue and sample file. However, on python 2.7, I don't see this 
error after running the steps to generate the problem. In regular python, val 
should never be unicode. A quick search shows that (at least some time ago) 
IronPython did everything in unicode, so the error is probably unique to 
IronPython.

Hmmm... so what to do... I'm not convinced this is the only place where unicode 
would cause a problem. Even in the python 3 version of pydicom, I expect val 
will come into MultiString as bytes (but probably will be converted inside).

I'll think about this. Meanwhile, the code fix above may not be quite right -- 
it doesn't check valtype before dealing with the unicode. As noted in the 
comments, valtype could be something like UID. Is it str(val) that is causing 
the problem when val is unicode? If so, then a solution that should work would 
be just to change valtype from str if necessary. This check after the trailing 
blank part should work:

if isinstance(val, unicode) and valtype == str:
    valtype = unicode    # or even a null function that does nothing should work

That would also work with the new list comprehension line for python >2.4.
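
For illustration, here is a minimal sketch of how that check might sit inside MultiString together with the list comprehension. It is not the pydicom implementation. The name multi_string_sketch is hypothetical, a plain list stands in for MultiValue so the example is self-contained, and `unicode` is aliased to str where it does not exist so the sketch also runs on Python 3.

```python
# Sketch only: `unicode` exists on Python 2 / IronPython 2.7; on Python 3 it
# is aliased to str so that the example still runs.
try:
    unicode
except NameError:
    unicode = str


def multi_string_sketch(val, valtype=str):
    """Split a DICOM string on backslashes, converting each non-empty part.

    Hypothetical stand-in for valuerep.MultiString: it returns a plain list
    where pydicom would return MultiValue(valtype, splitup).
    """
    # Remove a single trailing pad character (space or NUL)
    if val and (val.endswith(' ') or val.endswith('\x00')):
        val = val[:-1]

    # Suggested check: only override the default str conversion, so that an
    # explicit valtype such as UID is still honoured.
    if isinstance(val, unicode) and valtype == str:
        valtype = unicode

    splitup = [valtype(x) if x else x for x in val.split("\\")]
    if len(splitup) == 1:
        return splitup[0]
    return splitup
```

With this arrangement, a unicode val with the default valtype is passed through unicode() rather than str(), while a caller that supplies its own valtype sees no change in behaviour.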

Original comment by darcymason@gmail.com on 24 Apr 2012 at 12:56