pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License
140 stars 43 forks source link

Issue with a DICOM file that contains a list in the ImageType element #128

Closed SpiroGanas closed 4 years ago

SpiroGanas commented 4 years ago

Hello,

One of my DICOM files has an ImageType element that is a list:

image

Could you please update line 158 of filter.py to handle this situation?

https://github.com/pydicom/deid/blob/8ca068ea17a6c7c2997c12105c1b9ae1f460470b/deid/dicom/filter.py#L152-L163

Thank you very much,

vsoch commented 4 years ago

Happy to! Could you please provide a file and code snippet to reproduce the bug, and the version of deid? I’ll put in some time tomorrow. Thank you!

SpiroGanas commented 4 years ago

deid version: 0.2.0

This code adds a few fields to your example DICOM file, and then runs the code that gave me an exception:

import pydicom
from deid.dicom import get_files 
from deid.data import get_dataset
from deid.dicom import DicomCleaner

base = get_dataset('dicom-cookies')
dicom_files = list(get_files(base)) 
dicom_file = dicom_files[3]

ds = pydicom.dcmread(dicom_file)

# An exception occurs in the deid test file because there is no manufacturer element in the dicom
# So I added this to silence that error
ds.Manufacturer = 'TOSHIBA'

# Add the multi-value ImageType element to the sample DICOM file and save the file
# This is what caused the error in my real CT Scan
ds.ImageType = ['ORIGINAL', 'PRIMARY', 'LOCALIZER']

print(ds)
ds.save_as('./dicom_file_example.dcm')

client = DicomCleaner()
client.detect('./dicom_file_example.dcm')
client.clean()
vsoch commented 4 years ago

okay taking a look! I'll also add the ability to load the dataset from the ds object (instead of needing to save first). I'm thinking that we had a huge factor of the DicomParser and didn't check all these functions in the cleaner. I'll know soon.

vsoch commented 4 years ago

Here is the error I'm hitting - does this reproduce your error?

~/Desktop/Code/deid/deid/dicom/filter.py in empty(self, field)
    157     # This is the case of a data element
    158     if not isinstance(content, str):
--> 159         content = content.value
    160 
    161     if content == "":

AttributeError: 'NoneType' object has no attribute 'value'
vsoch commented 4 years ago

This one too:

~/Desktop/Code/deid/deid/dicom/filter.py in empty(self, field)
    161     # This is the case of a data element
    162     if not isinstance(content, str):
--> 163         content = content.value
    164 
    165     if content == "":

AttributeError: 'MultiValue' object has no attribute 'value'
vsoch commented 4 years ago

Working on it now!

vsoch commented 4 years ago

@SpiroGanas I have a PR for you to test! https://github.com/pydicom/deid/pull/129. I want to suggest that if you want custom parsing of your dicoms, to look at the DicomParser as an alternative to the DicomCleaner: https://pydicom.github.io/deid/getting-started/dicom-put/. They are fairly similar, the main difference is that the cleaner is optimized to return files based on lists they quality for (e.g. whitelist, blacklist) and the DicomParser is just given a deid recipe, and then performs the actions required. The DicomCleaner actually needs to be refactored to use the parser, but nobody has needed it yet (and I tend to develop it based on a needs basis). Anyway - the PR should fix the bug that you outlined here, and we can discuss further how you need to use deid / if the cleaner needs updating.