pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License

Pixel Data with undefined length must start with an item tag #65

Closed fimafurman closed 6 years ago

fimafurman commented 6 years ago

The following image: IMG00001.dcm.zip

The error happens after the clean() method returns successfully (blanking out the supplied coordinates) and the save_dicom method is called.

With tag (7fe0, 0010) got exception: Pixel Data with undefined length must start with an item tag
Traceback (most recent call last):
  File "/data/anaconda3/lib/python3.6/site-packages/pydicom/tag.py", line 30, in tag_in_exception
    yield
  File "/data/anaconda3/lib/python3.6/site-packages/pydicom/filewriter.py", line 475, in write_dataset
    write_data_element(fp, dataset.get_item(tag), dataset_encoding)
  File "/data/anaconda3/lib/python3.6/site-packages/pydicom/filewriter.py", line 435, in write_data_element
    raise ValueError('Pixel Data with undefined length must '
ValueError: Pixel Data with undefined length must start with an item tag

vsoch commented 6 years ago

Awesome thank you @fimafurman this is perfect for debugging! I'm going for a quick last minute run before the storm but I'll be able to take a look at this for you later today. Stay tuned!

fimafurman commented 6 years ago

Stay safe!

vsoch commented 6 years ago

Okay here is my investigation:

from deid.dicom.pixels.clean import DicomCleaner
mrclean = DicomCleaner()

Then I detect

mrclean.detect('/home/vanessa/Desktop/IMG00001.dcm')
Out[3]: 
{'flagged': True,
 'results': [{'coordinates': ['0,0,800,59'],
   'group': 'graylist',
   'reason': ' Modality contains US and Manufacturer contains Philips and Rows equals 768 and ManufacturerModelName contains EPIQ'},
  {'coordinates': [],
   'group': 'blacklist',
   'reason': ' ImageType contains SAVE and Modality contains CT|MR or SeriesDescription contains SAVE or BurnedInAnnotation contains YES or ImageType empty  or DateOfSecondaryCapture empty  or SecondaryCaptureDeviceManufacturer empty  or SecondaryCaptureDeviceManufacturerModelName empty  or SecondaryCaptureDeviceSoftwareVersions empty '}]}
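For anyone following along, here is a minimal sketch of how the rectangles to blank could be pulled out of a result shaped like the one above. The helper name `parse_coordinates` is hypothetical and not part of deid; it only assumes that each coordinate entry is a string of four comma-separated integers.

```python
# Hypothetical helper (not part of deid): collect every coordinate string
# from a detect()-style result and convert it to a list of four integers.

def parse_coordinates(detect_result):
    """Return a list of four-integer lists, one per coordinate string."""
    boxes = []
    for entry in detect_result.get("results", []):
        for coordinate in entry.get("coordinates", []):
            boxes.append([int(part) for part in coordinate.split(",")])
    return boxes


# Sample data mirroring the structure of the output above; the "reason"
# text is shortened here because it is not needed for parsing.
result = {
    "flagged": True,
    "results": [
        {"coordinates": ["0,0,800,59"], "group": "graylist", "reason": "(omitted)"},
        {"coordinates": [], "group": "blacklist", "reason": "(omitted)"},
    ],
}

boxes = parse_coordinates(result)
print(boxes)  # [[0, 0, 800, 59]]
```

Entries with an empty coordinate list (like the blacklist hit) simply contribute nothing.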

and then clean.

In [3]: mrclean.clean()
Scrubbing /home/vanessa/Desktop/IMG00001.dcm.

and reproduced the error:

ValueError: With tag (7fe0, 0010) got exception: Pixel Data with undefined length must start with an item tag
Traceback (most recent call last):
  File "/home/vanessa/anaconda3/lib/python3.6/site-packages/pydicom-1.1.0-py3.6.egg/pydicom/tag.py", line 30, in tag_in_exception
    yield
  File "/home/vanessa/anaconda3/lib/python3.6/site-packages/pydicom-1.1.0-py3.6.egg/pydicom/filewriter.py", line 475, in write_dataset
    write_data_element(fp, dataset.get_item(tag), dataset_encoding)
  File "/home/vanessa/anaconda3/lib/python3.6/site-packages/pydicom-1.1.0-py3.6.egg/pydicom/filewriter.py", line 435, in write_data_element
    raise ValueError('Pixel Data with undefined length must '
ValueError: Pixel Data with undefined length must start with an item tag
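To make the message in the traceback concrete, here is a rough standalone sketch of the kind of check that raises it. This is a simplified model written with the standard library only, not pydicom's actual code; the function name `check_pixel_data` and its parameters are assumptions for illustration. The idea is that a Pixel Data element flagged as having undefined length is expected to hold encapsulated (compressed) data, which begins with the DICOM item tag (FFFE,E000).

```python
# Simplified model of the writer-side check (assumption: this mirrors the
# idea behind the error, it is not copied from pydicom).

ITEM_TAG_LE = b"\xfe\xff\x00\xe0"  # (FFFE,E000) encoded little endian
ITEM_TAG_BE = b"\xff\xfe\xe0\x00"  # (FFFE,E000) encoded big endian


def check_pixel_data(value, is_undefined_length, little_endian=True):
    """Raise ValueError if undefined-length pixel data is not encapsulated."""
    if not is_undefined_length:
        # Defined-length (native, uncompressed) data needs no item framing.
        return
    item_tag = ITEM_TAG_LE if little_endian else ITEM_TAG_BE
    if not value.startswith(item_tag):
        raise ValueError(
            "Pixel Data with undefined length must start with an item tag"
        )


# Encapsulated data starts with an item; here an empty first item
# (item tag followed by a 4-byte zero length) stands in for real content.
encapsulated = ITEM_TAG_LE + b"\x00\x00\x00\x00"
check_pixel_data(encapsulated, is_undefined_length=True)  # passes silently

# Raw pixel bytes have no item framing, so the same flag makes the check fail.
raw_pixels = b"\x00" * 16
try:
    check_pixel_data(raw_pixels, is_undefined_length=True)
except ValueError as error:
    print(error)
```

The same raw bytes pass when the element is not flagged as undefined length, which suggests the mismatch is between the element's header state and the bytes assigned to it.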

Will report back after some more work (and I haven't gone on my run yet, haha). Just wanted to let you know I safely reproduced :)

fimafurman commented 6 years ago

Wow, you're awesome! :)

vsoch commented 6 years ago

One quick note - if you save a png:

mrclean.save_png()

that seems to work okay! So that gives us a hint that the problem is with the header rather than the pixels. The dicom file that the attempted save leaves in the same folder doesn't open in a standard viewer, which shows a message about the header.

vsoch commented 6 years ago

Okay now I'm doing it interactively:

I ran the equivalent of

mrclean.detect()
mrclean.clean()

and then the coordinate list is the region we had seen earlier

 coordinates
[[0, 0, 800, 59]]

Then we fill in with black (0). And then the save, as I thought, is just a native call to pydicom:

dicom.PixelData = self.cleaned.tostring()
dicom.save_as(dicom_name)

and the error is triggered.

Traceback (most recent call last):
  File "/home/vanessa/anaconda3/lib/python3.6/site-packages/pydicom-1.1.0-py3.6.egg/pydicom/tag.py", line 30, in tag_in_exception
    yield
  File "/home/vanessa/anaconda3/lib/python3.6/site-packages/pydicom-1.1.0-py3.6.egg/pydicom/filewriter.py", line 475, in write_dataset
    write_data_element(fp, dataset.get_item(tag), dataset_encoding)
  File "/home/vanessa/anaconda3/lib/python3.6/site-packages/pydicom-1.1.0-py3.6.egg/pydicom/filewriter.py", line 435, in write_data_element
    raise ValueError('Pixel Data with undefined length must '
ValueError: Pixel Data with undefined length must start with an item tag
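As a rough illustration of the steps above (blank the region, then serialize the pixels to raw bytes), here is a standard-library sketch. Several things are assumptions: the coordinate order is taken to be xmin, ymin, xmax, ymax; the frame is modelled as 8-bit, single-channel, row-major data; the width of 1024 is made up (only Rows equals 768 appears in the thread); and `blank_region` is a hypothetical helper, not deid's implementation.

```python
# Hypothetical sketch: zero a rectangle in a flat 8-bit frame and look at
# the bytes that would be assigned to PixelData afterwards.

ITEM_TAG_LE = b"\xfe\xff\x00\xe0"  # (FFFE,E000), start of encapsulated data


def blank_region(pixels, width, box):
    """Fill box = [xmin, ymin, xmax, ymax] with 0 in place (row-major frame)."""
    xmin, ymin, xmax, ymax = box
    height = len(pixels) // width
    xmax = min(xmax, width)   # clip the box to the frame
    ymax = min(ymax, height)
    for row in range(ymin, ymax):
        start = row * width + xmin
        stop = row * width + xmax
        pixels[start:stop] = bytes(stop - start)  # bytes(n) is n zero bytes
    return pixels


width, height = 1024, 768  # width is an assumption; Rows equals 768
frame = bytearray(b"\xff" * (width * height))  # all-white stand-in image

blank_region(frame, width, [0, 0, 800, 59])
raw = bytes(frame)  # analogous to the array-to-bytes step before saving

print(raw[:4])                      # b'\x00\x00\x00\x00'
print(raw.startswith(ITEM_TAG_LE))  # False
```

The cleaned bytes are plain uncompressed pixels, so they cannot begin with an item tag. If the original file stored compressed (encapsulated) Pixel Data, the element would presumably still be marked as undefined length after the assignment, which would explain the error. In pydicom terms a workaround might involve clearing `is_undefined_length` on the Pixel Data element and switching `file_meta.TransferSyntaxUID` to an uncompressed syntax, but that mapping is my guess and is not confirmed anywhere in this thread.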

So I think we need to ask the maintainers of pydicom how to deal with this.

scaramallion commented 6 years ago

Hi @fimafurman, is it OK if we include your dataset in pydicom as one of the test files?

fimafurman commented 6 years ago

Yes.

On Thu, Sep 13, 2018 at 7:49 PM scaramallion notifications@github.com wrote:

Hi @fimafurman https://github.com/fimafurman, is it OK if we include your dataset in pydicom as part of the test files?
