pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License
140 stars 43 forks source link

Adding tag groups for values and fields #120

Closed vsoch closed 4 years ago

vsoch commented 4 years ago

Description

Related issues: #115

This is the start of work to add the definition of values and fields list to the deid recipe. So far I've:

I haven't tested the second bullet above, and likely will tweak the work based on that.

I still need to:

I will try to work on this a little bit at a time - I'm opening a work in progress PR for now to indicate that this still needs work.

vsoch commented 4 years ago

This is in a good state and ready for review. @wetzelj I've finished adding the functionality that we discussed, along with some basic documentation. The best examples of interacting with get and replace identifiers are likely found in the test files. I wrote documentation for the sections, but didn't write a verbose user tutorial because I'm thinking that #119 will be hugely easier to use (and teach) as a wrapper around get and set, and where you can add functions, define the recipe, etc. So my suggestions for moving forward:

Please take whatever time you need to discuss and review, I'm going to work on other things for the rest of the afternoon.

vsoch commented 4 years ago

@wetzelj let me know when you've been able to test and review, I have a lot of projects at once so I'm working on one feature at a time for deid (and the other projects too!). I think (based on the issues with setting/ getting values, and your comments with JITTER) we should work on that after this.

vsoch commented 4 years ago

And please no rush! I have a crapton of things to work on, as I suspect that you do too, and the current state of the world doesn't make that easier :P

wetzelj commented 4 years ago

Too many meetings today! :( I've started to look, but haven't been able to dig in yet. I'll have more dedicated time tomorrow.

wetzelj commented 4 years ago

FYI... There's another print() statement in deid\dicom\actions.py - line 54.

vsoch commented 4 years ago

okay, removed. I'm probably trying to do too many things at once :/

wetzelj commented 4 years ago

Regarding the plan forward... #118 is really one of the higher priority issues from our standpoint, but if you think that #119 should be done first, that's okay too.

One other item that I experienced when testing - but I'm not too sure where it should go... During our discussion on #112 (comment), you mentioned that: REMOVE ALL re:(\d{7,0}) fits into the current structure.

In testing, I found that this functionality isn't present yet... with this rule included in the recipe, we fall into the else - "re is invalid variable type for REMOVE."

Would you prefer that this be opened as a new issue or incorporated into one of the others?

vsoch commented 4 years ago

That's correct - it's not implemented but instead the REMOVE all func:<function> is supported. I could look into adding this next - it looks like we could use the same functions in dicom/filter.py so the supported actions would be:

REMOVE <field> contains:expression
REMOVE <field> notcontains:expression
REMOVE <field> equals:expression
REMOVE <field> missing
REMOVE <field> notequals:expression
REMOVE <field> present
REMOVE <field> empty

This PR is already quite large - would you be okay with merging this, releasing, and then I'll follow up with adding the REMOVE tags? Note that REMOVE ALL re: is equivalent to REMOVE ALL contains: it's just more non programmer friendly to understand contains over re.

wetzelj commented 4 years ago

That's perfectly fine with me.

vsoch commented 4 years ago

Awesome! I'll get this in asap, and I am going into 2 hours of calls but I'll try to be sneaking and really be programming at the same time :) I know these fixes are important to you, and I want to support that with maintaining a good sense of urgency to keep things moving!

vsoch commented 4 years ago

https://pypi.org/project/deid/0.1.4/

And conda usually shows up within a few hours. Starting on the additions to remove now (I'm so sneaky!!)