suyashdb / pydicom

Automatically exported from code.google.com/p/pydicom
0 stars 0 forks source link

Is this a bug in write_file? #135

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
(code stub attached)
1. Modify a valid dvh value: ds1.DVHs[0].DVHData[0] = NewFirstPoint
2. Save using ds1.write_file, and read back again into new ds2
3. Read-in data is not updated correctly; ds1.DVHs[0].DVHData[0] <> 
ds2.DVHS[0].DVHData[0]. The data stored reflects the ORIGINAL data in ds1 
(which means that write_file is getting the data from a shadowed copy of the 
data? Subtle.)

What is the expected output? What do you see instead?

Sample code attached. Used a valid RT Dose file with DVH's as a starting point. 
Later, tried creating DVH's from scratch, and found a similar behavior. Once 
DVHData is appended to DVHs[index], elements of DVHData that are changed 
programmatically are not saved with write_file. 
Is this a bug, or have I missed a step? 

What version of the product are you using?
Tried this with both 0.9.6 and 0.9.7

***NOTE***: any text or attached files posted with the issue can be viewed
by anyone. You are solely responsible to ensure that they contain no
confidential information of any kind.

Please provide any additional information below.
CODE STUB ATTACHED

Original issue reported on code.google.com by bton...@gmail.com on 9 Nov 2013 at 6:17

Attachments:

GoogleCodeExporter commented 8 years ago
I cannot get your problem to reproduce when manually modifying the rtplan.dcm 
in the testfiles folder of pydicom (code attached for modification). 

When running your file and substituting the original rtplan.dcm and modified 
version I get:

0.9.7
Original first data point:  1.0
New first data point:  1234.567
Read in saved first point:  1234.567
Point was not updated?:  False
Changed point saved correctly?:  False

Also, note that at least in 0.9.7, your comparison in the last line of the 
script will not be true because NewFirstPoint is a float while 
ModifiedFirstPoint is a string since DVHData is of type DS. We changed this 
behavior in 0.9.8.

Do you have the same issues when using the most recent version of pydicom?

Also feel free to post a de-identified file which exhibits the problem that 
you're seeing so that we can reproduce the error.

Original comment by Suever@gmail.com on 9 Nov 2013 at 8:00

Attachments:

GoogleCodeExporter commented 8 years ago
OK, it could be the DS data type problem, and the float vs. string problem.
I'm on travel for a few days but will test those ideas when I get back. I
think it could well be that my assignments over-ride a data type, like from
string to float, and then this generates a silent error in write_file.
Still not clear on how write_file could possibly "remember" the original
assignment. Does the pydicom object have a "shadow" copy of DVHData?
When I get back I'll post an anonymous version of the troubling data file.
It was a problem with RT_DOSE files from more than one planning system.

Original comment by bton...@gmail.com on 10 Nov 2013 at 1:11

GoogleCodeExporter commented 8 years ago
The problem was not with ds.write_file, but with assignment to an element of 
DVHData.

Specifically:
#Make a dataset for the dvh
dvh = dcm.dataset.Dataset()
#Assign the dvh "all at once", which will work properly
dvh.DVHData = ['1.234','5.678'] 
print type(dvh.DVHData[0]) #prints <class 'dicom.valuerep.DSfloat'>
dvh.DVHData[0] = '8.987'
print type(dvh.DVHData[0]) #prints <type 'str'>

I could not find a "proper" way of assigning to an individual element of the 
dataset.
These all failed, storing strings/floats instead of DSfloats, or not allowing 
assignment:
dvh.data_element('DVHData').value[0]='3.14' #assigns, but as string
dvh.data_element('DVHData').value[0]=3.14 #assigns, but as float
dvh[0x3004,0x0058][0]='3.14' #does not support assignment

Is there a way to do this properly? The work around was to make a temp object 
myDVHData, work with it, and then assign it to dvh.DVHData in one step. Seems 
like you should be able to manipulate elements dvh.DVHData[index] directly, 
though.

Original comment by bton...@gmail.com on 15 Nov 2013 at 10:12

GoogleCodeExporter commented 8 years ago
It does seem that there is a type mis-match when you manually assign a single 
value of the DVHData DS, and I believe that this behavior was corrected in the 
most recent release 0.9.8. On the other hand, when you save the file, even if 
it is identifying itself as a str it should save and then load properly as a DS 
VR. Did you try this?

In the two of the three cases that you showed for assigning (three of four if 
you count the first example), although the type of the value right after 
assignment is what you entered, the file saves appropriately. Again, this is 
fixed in 0.9.8 but it should still be functional in 0.9.7

Finally, the last way that you tried to do the assignment didn't work because 
dvh[0x3004,0x0058] returns a data element which requires you to use the value 
property to do the actual assignment so the following would work:

dvh[0x3004,0x0058].value[0]='3.14'

Try saving the file and then reloading to ensure that these work for you.

Original comment by Suever@gmail.com on 16 Nov 2013 at 1:52

GoogleCodeExporter commented 8 years ago
I've tested the simple cases of assignment to an element of DVHData in 0.9.6 
and 0.9.8, and yes, the behavior is quite different (in 0.9.6, the internal 
representation returned by dataset.data_element is different from that accessed 
as a list element). 

Also, in 0.9.8 the incorrect type assignment can be "white-washed" by a 
write/read cycle, as you suggest. This doesn't seem very efficient, though.

Sticking to just 0.9.8, my feeling about all this is that this behavior isn't 
consistent with the project design goal to "to have a single copy of the data" 
(quoting from the developer's guide).

I ran into this issue earlier, when I had trouble getting overrides to work 
when DS got changed to decimal-string (I now know why that didn't work--my 
overrides tested for data element type).

I would like to avoid making copies of all dicom objects, since that just makes 
it easy to lose track of them. The current behavior of assignment (silently 
accepting a type change that is inconsistent with the VR), isn't going to work 
well in my projects. My workaround is to work with copies of the data, and only 
use all-at-once assignments of element lists--no assignments to individual 
members of the list.

My feeling is that this would all be safer, if pydicom exposed (and required?) 
a setter/getter for accessing data elements. Has that been considered (i.e. 
explicit get()/set() or get_value/set_value)? I see you're using "value" as a 
setter/getter in dataelem. Is it possible to modify self.value in DataElement 
to typecast from strings (and floats) to valuerep.DSfloat, when self.VR == DS?

And, it goes without saying, thanks for a great library.

Original comment by bton...@gmail.com on 20 Nov 2013 at 1:29

GoogleCodeExporter commented 8 years ago
To follow up on this issue. All but one of the use cases has been corrected in 
versions >= 0.9.8. The remaining case where the expected behavior isn't 
observer is when the DataElement.value is a list (VM > 1) and the user attempts 
to modify this value directly (i.e. DataElement.value.append(new_value)).

The current implementation of the DataElement.value setter/getter is that on 
setting, the value (or values) are converted to the proper type (i.e. str to 
DSfloat). When the value of an item is retrieved, no conversion occurs but 
rather the _value is returned. This is for performance reasons as we only want 
to convert data once.

The issue occurs when the list() used for DataElement.value is modified 
directly (via append(), insert() etc), the DataElement.value setter is never 
triggered since the list reference never changes, just the data inside of it. 
Ideally we would like to trigger the DataElement.value setter when the CONTENTS 
of the DataElement.value list are modified.

Unfortunately, it seems like this isn't easy to do apart from creating a 
wrapper (or proxy) for the built-in list datatype that overloads the append(), 
insert(), __setitem__() methods to convert the input data prior to inserting it 
into the list. I've seen some examples using collections.MutableSequence. I'm 
not sure this is a good path to go down or not.

Original comment by Suever@gmail.com on 3 Nov 2014 at 8:52