pydicom / deid

best effort anonymization for medical images using python
https://pydicom.github.io/deid/
MIT License
140 stars 43 forks source link

Default VR for added private tags #146

Closed wetzelj closed 4 years ago

wetzelj commented 4 years ago

When discussed in #127, we decided to use DICOM VR "UN" (undefined) as the default data type for a private tag. While this seemed to be okay when tested via our current unit tests, this actually causes an exception when calling pydicom FileDataset.save_as to write the output DICOM file due to the fact that the UN datatype is expecting bytes, not a string.

'With tag (1111, 2221) got exception: a bytes-like object is required, not \'str\'

Would you be open to defaulting new private tags to one of the string datatypes?

If you think this is okay, I'll make the change and add a couple unit tests.

vsoch commented 4 years ago

We should likely choose the one that is reasonable and has least potential to spit an error. If we were to do some kind of text, would adding a number spit out an error? Short text seems reasonable to not be too short or too long, and we'd want to discuss if it would work for other types too (or if we should check for the type in python and choose based on that).

wetzelj commented 4 years ago

Based on what I can tell from some initial evaluation, the ST type will be accepting of most variants we would see. Its accepted positive/negative integers, decimals, and even a string with lengths greater than the limits defined in the standard.

I don't think it's necessarily worth checking in python and dynamically setting the type. If we could definitively determine types, then maybe - but given the way the VRs are defined, we'd have to make some ambiguous decisions.

vsoch commented 4 years ago

Ok, let’s go with short text then!

vsoch commented 4 years ago

Closed with #148, type changed to short text.