ImagingDataCommons / highdicom

High-level DICOM abstractions for the Python programming language
https://highdicom.readthedocs.io
MIT License

Checks that values satisfy requirements of the VR #136

Open CPBridge opened 3 years ago

CPBridge commented 3 years ago

This has been discussed before (e.g. here) but we should have an issue to track it.

Pydicom is very loose in what it allows you to set as an attribute's value, even when the global configuration option pydicom.config.enforce_valid_values is set to True. We previously encountered this and resolved it narrowly for decimal strings (DS) in #57 and #65, but the issue is broader. Checks for the other VRs are largely absent from pydicom, yet many VRs have limits on string length, the set of allowable characters, capitalisation, etc. (see the standard). The result is many one-off checks being included in highdicom to validate user-supplied values, and probably many missed checks that could allow files with invalid values to be produced. We should tackle this in a more unified way to reduce redundancy and the probability of invalid values slipping through the net.
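To make the kind of check concrete, here is a minimal sketch for a single VR, Code String (CS), which the standard limits to 16 characters drawn from uppercase letters, digits, space and underscore. The function name `check_code_string` is hypothetical and is not part of pydicom or highdicom; it only illustrates what a shared per-VR check could look like.

```python
import re

# Hypothetical helper for illustration only; not a pydicom or highdicom API.
# CS (Code String): at most 16 characters, limited to uppercase letters,
# digits, the space character and underscore.
_CS_PATTERN = re.compile(r"[A-Z0-9 _]{0,16}")


def check_code_string(value: str) -> str:
    """Return ``value`` unchanged if it is a valid CS, else raise."""
    if not isinstance(value, str):
        raise TypeError("A CS value must be a str.")
    if len(value) > 16:
        raise ValueError(
            f"CS value '{value}' exceeds the maximum length of 16 characters."
        )
    if _CS_PATTERN.fullmatch(value) is None:
        raise ValueError(
            f"CS value '{value}' contains characters that are not allowed "
            "(only A-Z, 0-9, space and underscore are permitted)."
        )
    return value


# A valid value passes through untouched.
check_code_string("DERIVED")

# Lowercase letters are rejected.
try:
    check_code_string("derived")
except ValueError as error:
    print(error)
```

A real solution would need one such rule per VR and would ideally live upstream, so this is only meant to show the shape of the check.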

My feeling is that as far as possible we should add this functionality to pydicom and then integrate into highdicom.

hackermd commented 3 years ago

See also https://github.com/pydicom/pydicom/issues/1414

hackermd commented 3 years ago

We should avoid the global configuration as much as possible. My preferred approach would be to add a parameter to the constructor of Dataset and friends (Sequence, etc.) such as enforce_valid_vr. That should allow us to enforce correct encoding when creating new objects while potentially allowing some invalid values when reading existing objects.
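A rough sketch of what I mean by a per-object switch rather than a global one. Everything here is hypothetical: `CheckedRecord` is a toy stand-in for a Dataset-like container, and neither the class nor the `enforce_valid_vr` parameter exists in pydicom today. Only the CS rule (at most 16 characters from A-Z, 0-9, space and underscore) is checked, to keep the example short.

```python
import re
from typing import Dict

# CS (Code String) rule: up to 16 characters from A-Z, 0-9, space, underscore.
_CS_PATTERN = re.compile(r"[A-Z0-9 _]{0,16}")


class CheckedRecord:
    """Toy stand-in for a Dataset-like container (hypothetical)."""

    def __init__(self, enforce_valid_vr: bool = True) -> None:
        # The flag belongs to this instance, so strict and lenient
        # objects can coexist in the same process.
        self._enforce_valid_vr = enforce_valid_vr
        self._values: Dict[str, str] = {}

    def set_code_string(self, keyword: str, value: str) -> None:
        """Store a CS value, validating it only if this instance is strict."""
        if self._enforce_valid_vr and _CS_PATTERN.fullmatch(value) is None:
            raise ValueError(f"Invalid CS value for {keyword}: '{value}'")
        self._values[keyword] = value

    def get(self, keyword: str) -> str:
        return self._values[keyword]


# Strict object, as we would want when creating new content.
new_object = CheckedRecord(enforce_valid_vr=True)
new_object.set_code_string("Modality", "SEG")

# Lenient object, as we might want when holding values read from an
# existing file that is known to contain non-conformant values.
existing_object = CheckedRecord(enforce_valid_vr=False)
existing_object.set_code_string("Modality", "seg")
```

The point of the design is that strictness is decided where the object is constructed, so library code that creates new objects cannot be weakened by a global setting changed elsewhere.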

CPBridge commented 2 years ago

A note that pydicom 2.3.0 has a lot of new functionality to help us here; we just need to figure out how best to make use of it.