pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License
140 stars 43 forks source link

#197 'remove and replace/jitter combo' - added once-evaluated list of… #204

Closed glebsts closed 2 years ago

glebsts commented 2 years ago

Description

Whitelisting was not possible for replacing. Now fields being replaced or jittered are excluded from removal (either REMOVE ALL or REMOVE SomeField).

It is done using separate property and private variable to keep result (lazy eval).

Made clarification regarding KEEP list - those fields are being removed from field list to process, meaning that KEEP SomeField + REPLACE SomeField ... won't work, as kept field is being excluded from list of fields to process for other actions.

Added tests. Bumped release candidate number and changelog (+small fix for previous change)

Minor docs update.

Related issues: #197 , #198

Please include a summary of the change(s) and if relevant, any related issues above.

Checklist

Open questions

codestyle on excluded_from_deletion property is discussable, opened for suggestions.

vsoch commented 2 years ago

This looks perfecto! And it makes total sense - if you have an opinion about changing a field, it doesn't make sense to then remove it later. I'm wondering - is there any reason we wouldn't want to apply this to ADD too? Pinging @wetzelj too if he wants to take a look!

glebsts commented 2 years ago

is there any reason we wouldn't want to apply this to ADD too

https://github.com/pydicom/deid/blob/0.2.29-rc/deid/tests/test_replace_identifiers.py#L749 test_remove_all_add_field_compounding_should_add So it is kinda applied already, and docs say it :)

wetzelj commented 2 years ago

If it's okay with you two, I'll review it all once it's merged into 0.2.29-rc and I've been able to upgrade my application to that version.

To confirm my understanding of the functionality changes:

Assuming my understanding is correct, I think the second bullet above makes sense, and I'm on board with it too, even though its a change to the way current functional recipes currently work in 0.2.28. This change would help to ensure that we have cleaner recipes with less redundant and/or irrelevant commands.

Is my understanding correct?

glebsts commented 2 years ago

@wetzelj your understood it correctly. My recipe before was

REMOVE FieldA
REMOVE FieldC
REPLACE FieldB 'foo'

As my task is maximum anonymization and only leaving study data, and there are hundreds of tags, such recipe was very long and error-prone.

Now with this change I can do

REMOVE ALL
REPLACE FieldB 'foo'

only setting to keep what I need. Before it was blacklist, now it it whitelist, which is better,

wetzelj commented 2 years ago

Got it... I think this totally makes sense and is a good change to make.

@vsoch - Given the change in behavior for recipes, do you think it would warrant bumping this to 0.3.0 instead of 0.2.29?

vsoch commented 2 years ago

yep I'm on board with that!