pydicom / deid

best effort anonymization for medical images using python

https://pydicom.github.io/deid/

MIT License

140 stars 43 forks source link

adding stop_before_pixels to read_file of dicom parser #164

Open vsoch opened 3 years ago

vsoch commented 3 years ago

Signed-off-by: vsoch vsochat@stanford.edu

Description

Adding stop_before_pixels to the Parser, to be used by get_identifiers

Related issues: #163

Open questions

is this supported across versions of pydicom?
are there other read_file parameters that might want to be passed, so we should ask for a dictionary of arguments instead?
are there other instances of read_file that a user might want this added to?
should the default be False?

allanhummer commented 3 years ago

Thanks for picking this up!

A dictionary of arguments would certainly be a more flexible method. I also think that other arguments of dcmread like "defer_size" or "specific_tags" could be useful.

I think False should be the default, as otherwise the image information would be missing if one does a simple load -> edit header -> save procedure.

Not really part of core DEID, but an instance where the "stop_before_pixels" argument is also useful is here: create-dicom-csv.py Duplicate of Duplicate of ##

vsoch commented 3 years ago

@allanhummer two possible suggestions for this PR:

add the flag to the example you provided
Instead of writing out specific args,have a dict for pydicom_opts (or similar) and expand it into the command. This would be flexible to any set of arguments / change over time for pydicom.

Let me know your thoughts!

vsoch commented 3 years ago

@wetzelj what are your thoughts for this PR here? I want to make sure it doesn't get too dated to make integrating hard in case you think it's useful too. But if nobody needs it, probably okay to just let be until someone asks again :)

wetzelj commented 3 years ago

Hey @vsoch- On the surface, I think it's a very logical inclusion. If you don't need the pixel data, why read it?

However, in my implementation, I'm not utilizing get_identifiers at all - I only use replace_identifiers. Initially, I was thinking that maybe it would be good to expose the stop_before_pixels to replace identifiers as well, but ultimately, (for my purposes) it wouldn't make sense. If we were to run with stop_before_pixels=True, since parser.dicom is what gets saved as the replaced dicom file, we'd end up with a whole mess of files that were header only, no actual images.

While it won't be immediately beneficial to me, I think it would be a fine inclusion as it's currently coded in the pull request as an additional parameter on get_identifiers(), but honestly, I could go either way... merge the PR or let it sit until someone needs it.

vsoch commented 3 years ago

merge the PR or let it sit until someone needs it.

Yeah, I'm good to do the latter - keep here for discussion until someone needs it. Thanks @wetzelj !

opennog commented 7 months ago

Hi I have a question here. Are you aware if the reading time when ´stop_before_pixels=True´ is quicker than when stop_before_pixels=False?

Thanks!

vsoch commented 7 months ago

I am not, but you could try it out!