pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License

How to obtain DICOM header field's specific value for use in replacement function and how to preserve DICOM folder structure #169

Closed jkarten95 closed 3 years ago

jkarten95 commented 3 years ago

Hi All,

I was wondering how to obtain the value of a DICOM field within a function I am using to replace this value? For example, I would like to use the original Patient ID field value in order to generate the new Patient ID field value. I seem to be able to obtain a DicomField object, but I do not see an attribute to return the specific value for the tag/field of interest.

Additionally, in using the replace_identifiers() method, the initial folder structure of my DICOM studies seems to be erased. Is there a way around this happening?

Thanks for any help!

vsoch commented 3 years ago

See the generate_uid function example https://pydicom.github.io/deid/examples/func-replace/

jkarten95 commented 3 years ago

@vsoch unless I misunderstand, that example returns field.name which is the name of the DICOM tag. Is there a way to extract the specific number/string associated with that tag? I.e. If I have patientID as my tag, how do I extract patientID = 1234? Thanks for your help.

vsoch commented 3 years ago

You can return whatever you like @jkarten95! The example returns a custom UID string that is not related to the field.

def generate_uid(item, value, field, dicom):
    '''This function will generate a uuid! You can expect it to be passed
       the dictionary of items extracted from the dicom (and your function)
       and variables, the original value (func:generate_uid) and the field
       name you are applying it to.
    '''
    import uuid
    # a field can either be just the name string, or a DicomElement
    if hasattr(field, 'name'):
        field = field.name
    prefix = field.lower().replace(' ', " ")
    return prefix + "-" + str(uuid.uuid4())

So I'd try something like:

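def generate_uid(item, value, field, dicom):
    # a field can either be just the name string, or an object with a .name
    if hasattr(field, 'name'):
        field = field.name
    # lower() must be called; comparing the bare method is always unequal
    if field.lower() != "patientid":
        # there is probably a better way to do this
        return getattr(dicom, field)
    # the pydicom keyword is PatientID (capital "ID")
    return dicom.PatientID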

Note that the first example on the docs page is missing the "dicom" parameter; the page also has a good example of how to do development. I'll need to fix that doc page on some upcoming weekend.

Try that and let me know how it goes! You can debug by printing the item, value, and field when you test (it's been a while and this is off the top of my head).
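To make the idea concrete, here is a minimal sketch of a replacement function that derives the new Patient ID from the original one. It assumes the function is called with the same four arguments as generate_uid above and that `dicom` exposes the header values as attributes. The name pseudonymize_patient_id, the "anon-" prefix, and the hashing scheme are hypothetical choices for illustration and are not part of deid. A stand-in object replaces the real dataset so the sketch runs without a DICOM file.

```python
import hashlib
from types import SimpleNamespace


def pseudonymize_patient_id(item, value, field, dicom):
    """Derive a new PatientID from the original one (illustrative only)."""
    # Read the original value straight from the dataset-like object.
    original = str(getattr(dicom, "PatientID", ""))
    # Hypothetical scheme: a short, stable SHA-256 digest of the original ID.
    digest = hashlib.sha256(original.encode("utf-8")).hexdigest()
    return "anon-" + digest[:12]


# Stand-in for the dataset so the sketch runs without reading a file.
fake_dicom = SimpleNamespace(PatientID="1234")
new_id = pseudonymize_patient_id(
    item={},
    value="func:pseudonymize_patient_id",
    field="PatientID",
    dicom=fake_dicom,
)
print(new_id)
```

Because the digest is deterministic, the same original ID always maps to the same new ID, which keeps studies from one patient grouped together. A plain hash of a short ID is easy to reverse by brute force, so a real setup would likely want to mix in a secret salt.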

vsoch commented 3 years ago

You probably don't need the line to check the field because in your deid recipe you can just do:

REPLACE PatientID func:generate_uid

wetzelj commented 3 years ago

For my purposes I was able to read the file and append the original values on to the array that is passed into the ids parameter of replace_identifiers(). In my use case, I'm already reading the file for some other purposes, so introducing this step to add these into the ids array was of minimal impact. The original field values are then accessible as variables for use in the recipe.

With this functionality, our end users can select the fields whose original values they want to stash. In the recipe, they simply use the original-value variable as they would any other variable. I chose to make this an end-user-selected list of fields rather than caching the entire original header. In my application, all of this happens when preparing the call to replace_identifiers().

REMOVE PatientID
ADD PatientID var:original_PatientID

The "original_" prefix is just a convention I chose. Our end users just select the field name, and then know they can reference any selected field as var:original_<fieldname>.
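The stashing step described above could look roughly like the sketch below. It assumes `ids` is a dictionary keyed by file path whose values are dictionaries of recipe variables, and that the header object exposes fields as attributes. The helper name stash_originals and the example path are hypothetical, and a stand-in object replaces the real header so the sketch runs on its own.

```python
from types import SimpleNamespace


def stash_originals(ids, file_path, header, fields, prefix="original_"):
    """Copy selected header values into ids[file_path] under prefixed names."""
    variables = ids.setdefault(file_path, {})
    for name in fields:
        # Skip fields the header does not contain rather than storing None.
        if hasattr(header, name):
            variables[prefix + name] = str(getattr(header, name))
    return ids


# Stand-in header; in practice this might come from
# pydicom.dcmread(path, stop_before_pixels=True).
header = SimpleNamespace(PatientID="1234", PatientName="Doe^Jane")
ids = stash_originals(
    {},
    "study/series/image1.dcm",
    header,
    ["PatientID", "AccessionNumber"],
)
print(ids)
```

The resulting dictionary would then be passed as the ids argument of replace_identifiers(), so that a recipe line such as ADD PatientID var:original_PatientID can pick the value up.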

jkarten95 commented 3 years ago

Thank you both so much for the help! I was able to complete the code.