pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License
140 stars 43 forks source link

Adding ability for deid to support deid-provided functions #208

Closed vsoch closed 2 years ago

vsoch commented 2 years ago

As of this version, a user can specify a value as a deid_func: meaning that we use a deid provided function. This is useful for providing some general uid and a customized jitter function, along with other functions that users might want to add. With this change I have provided docs and a contributing section for how to do this, along with running pyflakes on the tests.

This will close #207 and will assist with #206

Signed-off-by: vsoch vsoch@users.noreply.github.com

vsoch commented 2 years ago

Ping @moloney and @wetzelj and @cfhammill !

moloney commented 2 years ago

Sorry for dropping off on this, I am taking a look now. One immediate comment: why not use the pydicom "generate_uid" function as I had done previously? It can replace all the code in "deid/dicom/actions/uids.py", and it can provide purely random UIDs as you are doing in your code or it can provide a consistent mapping (same input UID produces same anonymized UID) so that study / series UIDs remain consistent after anonymization without any coordination between the processing of the separate files. This latter functionality is critical for many use cases.

Thanks for the hard work here!

vsoch commented 2 years ago

@moloney that's a fantastic idea! I 100% agree, and I feel silly that I didn't know about it. Can you suggest a name / args etc of how it should look?

moloney commented 2 years ago

There are just two optional args, a prefix (defaults to the pydicom root prefix) and an optional list of things to use as entropy when generating the UID (if nothing is provided it just uses system randomness). If you pass the original UID in as the only entropy source you will get a stable result across all files (same input results in same output).

Docs are here: https://pydicom.github.io/pydicom/dev/reference/generated/pydicom.uid.generate_uid.html

So I guess the deid func should just take two optional args, a prefix and a toggle to enable / disable stable remapping (i.e. passing in the original UID) which I think should default to being turned on.

vsoch commented 2 years ago

@moloney for the remapping bit - does that mean passing the original value as the prefix or in the list of entopy items (it sounds like the second for the stability?) I just want to double check before writing it!

vsoch commented 2 years ago

okay I think I have it figured out! I'll have an update to the PR for you after work today.

vsoch commented 2 years ago

@moloney all set! https://github.com/pydicom/deid/pull/208/commits/bca9596ea2766bac7cb9838a9b04f265156d3d5e

vsoch commented 2 years ago

heads up y'all! I'm going to merge this end of day - I think it looks fairly good and is an improvement to deid and we've had overall positive review.

wetzelj commented 2 years ago

Sounds good to me!