KitwareMedical / dicom-anonymizer

Tool to anonymize DICOM files according to the DICOM standard
BSD 3-Clause "New" or "Revised" License
104 stars 47 forks source link

Proposal: Make get_UID Pluggable #86

Open smasuda opened 4 months ago

smasuda commented 4 months ago

Hi there!

I'd like to propose making get_UID pluggable so that users can implement their own version without affecting others.

In certain cases, it is necessary to keep track of UID changes by recording the UIDs before and after modifications over a series of anonymizations done in separate processes. Additionally, some users might want to use a specific prefix to generate their own UIDs. The current implementation doesn't allow this since get_UID is tightly coupled with the code, and its dictionary is volatile.

If this proposal is acceptable, I am willing to raise a PR for review. I would like to hear feedback on whether this change is agreeable.

pchoisel commented 4 months ago

Hi @smasuda,

Your usecase is definitely valid to me. However, I'm wondering if your implementation could benefit everyone.

What about this: dicom-anonymizer now accepts a new argument: a path to a json file mapping old UID to new UID. After the end of dicom-anonymizer execution, the json file is eventually (optionally) updated with any new UID that might have been generated. That way, you could keep the same UID replacement between several datasets.

What do you think about this ? If this does not suit you, I would also agree with making the function replace_element_UID or get_UID injectable.

smasuda commented 4 months ago

Hi @pchoisel - thanks for your feedback! re: the json file approach, I agree with you that it will be surly suffice for most of the use cases such as an anonymization in a labo, however, it may face a scalability issue when you work with a large dataset with multiple members/teams, which is the original motivation of mine and I'd still want to have a solution.

By making replace_element_UID injectable, one could implement the file approach as well if necessary. What do you think?

pchoisel commented 4 months ago

@smasuda, no worries.

I'm ok with making replace_element_UID injectable, you can go ahead and open a PR.

Thanks !

smasuda commented 4 months ago

@pchoisel actually I came to have a second thought - once this merged PR ( https://github.com/KitwareMedical/dicom-anonymizer/pull/83 ) is released, we could achieve the pluggable replace_element_UID via defining your own rule generator.. Not knowing the release cycle of this project, yet would you be able to consider making a small release?

pchoisel commented 3 months ago

Yes, I think we are having enough small features to create a minor release. I can wait after this is merged to create it.

smasuda commented 3 months ago

@pchoisel

I can wait after this is merged to create it.

Just to be on the same page, I'm waiting for the release; after the release I'll try to see replace_element_UID is injectable. Sorry if I'm not clear enough about the steps to take.

pchoisel commented 3 months ago

All right, I understood it wrong then I'll try to make a release this week

sjswerdloff commented 3 months ago

@pchoisel

I can wait after this is merged to create it.

Just to be on the same page, I'm waiting for the release; after the release I'll try to see replace_element_UID is injectable. Sorry if I'm not clear enough about the steps to take.

I'd be very interested in this. I wrote a pseudonymiser that is in PyMedPhys (experimental) and a key aspect of that was to have the UIDs stay consistent amongst objects (referenced UIDs, Study and Series UIDs). To avoid rainbow table attacks against hashed values, I added pepper (which could be shared within a facility to allow for consistent forward conversion of data, but would need to be kept sufficiently secure).

pchoisel commented 2 months ago

Release done !