pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License
140 stars 43 forks source link

KeyError Exception when stripping sequences #137

Closed wetzelj closed 4 years ago

wetzelj commented 4 years ago

When stripping sequences from DICOM headers, when a given sequence element exists in multiple locations in the header - or when sequences are nested, the del dicom[elem.tag] statement in tags.py produces a KeyError. In the CtBrain1.dcm sample file, this scenario can be seen with tag (0008, 1110) which occurs as a top-level sequence but then also occurs as a nested item within a sequence further down in the header.

I've created a test method which demonstrates this bug and implemented a two possible fixes in this commit. Let me know if which fix (if either) you prefer and if you'd like me to submit it as a pull request.

Hi @vsoch! I'm back! :)

vsoch commented 4 years ago

hey @wetzelj welcome back!!

I think solution A (the one not commented out) looks great - if we check for None then I don't think we will hit that key error. I vote for A, and we can adjust if there is some bug report about it.

When you open the PR (or put directly in the docstring, that's probably better) please briefly explain the test (e.g., I think we'd want to make sure that a nested value is properly removed). And I suspect we will hit other little bugs with sequences, but I'm happy to work on them as they come - smaller, well scoped PRs are good! Looking forward to your PR! Don't forget to update the changelog and do a version bump :)

wetzelj commented 4 years ago

Closing... Fixed with PR #138