suever / pydicom-experimental

pydicom test
0 stars 1 forks source link

Is this a bug in write_file? #135

Open suever opened 9 years ago

suever commented 9 years ago

From bton...@gmail.com on November 09, 2013 13:17:27

What steps will reproduce the problem? (code stub attached)

  1. Modify a valid dvh value: ds1.DVHs[0].DVHData[0] = NewFirstPoint
  2. Save using ds1.write_file, and read back again into new ds2
  3. Read-in data is not updated correctly; ds1.DVHs[0].DVHData[0] <> ds2.DVHS[0].DVHData[0]. The data stored reflects the ORIGINAL data in ds1 (which means that write_file is getting the data from a shadowed copy of the data? Subtle.) What is the expected output? What do you see instead? Sample code attached. Used a valid RT Dose file with DVH's as a starting point. Later, tried creating DVH's from scratch, and found a similar behavior. Once DVHData is appended to DVHs[index], elements of DVHData that are changed programmatically are not saved with write_file. Is this a bug, or have I missed a step? What version of the product are you using? Tried this with both 0.9.6 and 0.9.7 _NOTE_: any text or attached files posted with the issue can be viewed by anyone. You are solely responsible to ensure that they contain no confidential information of any kind. Please provide any additional information below. CODE STUB ATTACHED

Attachment: dvh_rw_error.py

Original issue: http://code.google.com/p/pydicom/issues/detail?id=135

suever commented 9 years ago

From Suever@gmail.com on November 09, 2013 12:00:13

I cannot get your problem to reproduce when manually modifying the rtplan.dcm in the testfiles folder of pydicom (code attached for modification).

When running your file and substituting the original rtplan.dcm and modified version I get:

0.9.7 Original first data point: 1.0 New first data point: 1234.567 Read in saved first point: 1234.567 Point was not updated?: False Changed point saved correctly?: False

Also, note that at least in 0.9.7, your comparison in the last line of the script will not be true because NewFirstPoint is a float while ModifiedFirstPoint is a string since DVHData is of type DS. We changed this behavior in 0.9.8.

Do you have the same issues when using the most recent version of pydicom?

Also feel free to post a de-identified file which exhibits the problem that you're seeing so that we can reproduce the error.

Attachment: add_DVHData.py

suever commented 9 years ago

From bton...@gmail.com on November 10, 2013 05:11:19

OK, it could be the DS data type problem, and the float vs. string problem. I'm on travel for a few days but will test those ideas when I get back. I think it could well be that my assignments over-ride a data type, like from string to float, and then this generates a silent error in write_file. Still not clear on how write_file could possibly "remember" the original assignment. Does the pydicom object have a "shadow" copy of DVHData? When I get back I'll post an anonymous version of the troubling data file. It was a problem with RT_DOSE files from more than one planning system.

suever commented 9 years ago

From bton...@gmail.com on November 15, 2013 14:12:50

The problem was not with ds.write_file, but with assignment to an element of DVHData.

Specifically:

Make a dataset for the dvh

dvh = dcm.dataset.Dataset()

Assign the dvh "all at once", which will work properly

dvh.DVHData = ['1.234','5.678'] print type(dvh.DVHData[0]) #prints <class 'dicom.valuerep.DSfloat'> dvh.DVHData[0] = '8.987' print type(dvh.DVHData[0]) #prints <type 'str'>

I could not find a "proper" way of assigning to an individual element of the dataset. These all failed, storing strings/floats instead of DSfloats, or not allowing assignment: dvh.data_element('DVHData').value[0]='3.14' #assigns, but as string dvh.data_element('DVHData').value[0]=3.14 #assigns, but as float dvh[0x3004,0x0058][0]='3.14' #does not support assignment

Is there a way to do this properly? The work around was to make a temp object myDVHData, work with it, and then assign it to dvh.DVHData in one step. Seems like you should be able to manipulate elements dvh.DVHData[index] directly, though.

suever commented 9 years ago

From Suever@gmail.com on November 15, 2013 17:52:21

It does seem that there is a type mis-match when you manually assign a single value of the DVHData DS, and I believe that this behavior was corrected in the most recent release 0.9.8. On the other hand, when you save the file, even if it is identifying itself as a str it should save and then load properly as a DS VR. Did you try this?

In the two of the three cases that you showed for assigning (three of four if you count the first example), although the type of the value right after assignment is what you entered, the file saves appropriately. Again, this is fixed in 0.9.8 but it should still be functional in 0.9.7

Finally, the last way that you tried to do the assignment didn't work because dvh[0x3004,0x0058] returns a data element which requires you to use the value property to do the actual assignment so the following would work:

dvh[0x3004,0x0058].value[0]='3.14'

Try saving the file and then reloading to ensure that these work for you.

suever commented 9 years ago

From bton...@gmail.com on November 19, 2013 17:29:44

I've tested the simple cases of assignment to an element of DVHData in 0.9.6 and 0.9.8, and yes, the behavior is quite different (in 0.9.6, the internal representation returned by dataset.data_element is different from that accessed as a list element).

Also, in 0.9.8 the incorrect type assignment can be "white-washed" by a write/read cycle, as you suggest. This doesn't seem very efficient, though.

Sticking to just 0.9.8, my feeling about all this is that this behavior isn't consistent with the project design goal to "to have a single copy of the data" (quoting from the developer's guide).

I ran into this issue earlier, when I had trouble getting overrides to work when DS got changed to decimal-string (I now know why that didn't work--my overrides tested for data element type).

I would like to avoid making copies of all dicom objects, since that just makes it easy to lose track of them. The current behavior of assignment (silently accepting a type change that is inconsistent with the VR), isn't going to work well in my projects. My workaround is to work with copies of the data, and only use all-at-once assignments of element lists--no assignments to individual members of the list.

My feeling is that this would all be safer, if pydicom exposed (and required?) a setter/getter for accessing data elements. Has that been considered (i.e. explicit get()/set() or get_value/set_value)? I see you're using "value" as a setter/getter in dataelem. Is it possible to modify self.value in DataElement to typecast from strings (and floats) to valuerep.DSfloat, when self.VR == DS?

And, it goes without saying, thanks for a great library.

suever commented 9 years ago

From Suever@gmail.com on November 03, 2014 12:52:50

To follow up on this issue. All but one of the use cases has been corrected in versions >= 0.9.8. The remaining case where the expected behavior isn't observer is when the DataElement.value is a list (VM > 1) and the user attempts to modify this value directly (i.e. DataElement.value.append(new_value)).

The current implementation of the DataElement.value setter/getter is that on setting, the value (or values) are converted to the proper type (i.e. str to DSfloat). When the value of an item is retrieved, no conversion occurs but rather the _value is returned. This is for performance reasons as we only want to convert data once.

The issue occurs when the list() used for DataElement.value is modified directly (via append(), insert() etc), the DataElement.value setter is never triggered since the list reference never changes, just the data inside of it. Ideally we would like to trigger the DataElement.value setter when the CONTENTS of the DataElement.value list are modified.

Unfortunately, it seems like this isn't easy to do apart from creating a wrapper (or proxy) for the built-in list datatype that overloads the append(), insert(), setitem() methods to convert the input data prior to inserting it into the list. I've seen some examples using collections.MutableSequence. I'm not sure this is a good path to go down or not.