Closed howardpchen closed 5 years ago
I think we do have a use case for this, I released it recently - see this example https://github.com/pydicom/deid/tree/master/examples/dicom/header-manipulation
I think perhaps the addition you would need is a way to apply the function to ALL tags? Right now, the use case is for a specific tag like:
REPLACE StudyInstanceUID func:generate_uid
and then you would pass into the items dictionary like:
for item in items:
items[item]['generate_uid'] = generate_uid
Maybe something like this would work to specify all tags?
REPLACE * func:clean_data
And you would still add it
for item in items:
items[item]['generate_uid'] = clean_data
Does the * seem appropriate to indicate all tags? or should we have a keyword ALL (assuming that it's never going to be an actual tag :OP)
REPLACE ALL func:clean_data
As a developer * is more intuitive, but for users that aren't programmers ALL might be a better choice. What do you think?
If you have something very specific / custom you could also just run it over the fields after you extract them, e.g.,
new_items = dict()
for field, value in items['items']:
new_items[field] = func(value)
items['items'] = new_items
Agree that * is less intuitive to non programmers, who seem to be the key target audience for the Recipes mechanics.
I was thinking to leverage existing field expander mechanism which can handle
REPLACE allFields: func:clean_data
with a few extra lines:
if expander.lower() == "endswith":
fields = [x for x in contenders if x.endswith(expression)]
elif expander.lower() == "startswith":
fields = [x for x in contenders if x.startswith(expression)]
elif expander.lower() == "contains":
fields = [x for x in contenders if expression in x]
elif expander.lower() == "allfields":
fields = contenders
The code also includes a 'contains' in addition to endswith and startswith. The use case is for something like Date and DateTime:
JITTER contains:Date 123
Covers both fields like StudyDate and AcquisitionDateTime. Something else that tends to be in the middle of a field is "SOP" like Referenced SOP Class UID in File:
REMOVE contains:SOP
Actually re: all fields I think it does make more sense to do it without the colon for the user, as you suggested:
REPLACE ALL func:clean_data
I feel pretty good that All won't become a standard DICOM tag soon :)
With the last amendment, I agree on all fronts! It wouldn't be good to have something like
JITTER allFields: 123
to indicate all fields, because we would want to catch that as an error that the user possibly parsed the command incorrectly.
I'm having dinner now, but I'll be happy to put this in for you tomorrow (unless you get to it first!) Thanks for the good suggestion.
hey @howardpchen ! I've done a PR to work on this issue. Would you be able to review when you get a chance? https://github.com/pydicom/deid/pull/88 No rush, it's Sunday firstly, and I'm going to go out for a run and will be back later anyway :) Happy weekend!
My group's use case includes passing a function to all DICOM Tags to check for certain PHI words/phrases. Noticed that while we can do startswith and endswith, there isn't an easy way to apply func:my_function on most or all of the DICOM tags. Would it be of interest if I expanded the deid/dicom/fields.py functionality (e.g. add expander words like "all," "contains," "allexcept" etc) and sent out a pull request? Thanks!