DIAGNijmegen / rse-panimg

Conversion of medical images to MHA and TIFF.
Apache License 2.0

Option to disable header validation #70

Closed: nlessmann closed this issue 2 years ago

nlessmann commented 2 years ago

We sometimes get data with invalid headers, often caused by anonymization software - for example, in DICOM files, fields like PatientBirthDate are often replaced by a string like "ANONYMIZED". That is invalid DICOM in principle, but we still need to be able to read those files (mainly outside of grand challenge).

It would be useful to be able to turn off header validation, or to have panimg ignore (=remove) invalid headers instead of rejecting the image.

jmsmkn commented 2 years ago

I recognize the issue but honestly, I don't want to add this option. Validation and standardization are the key aims of this library, which is why we use pydantic under the hood. I would rather people fix their deidentification scripts to produce valid DICOM. I wonder if this should be solved elsewhere: could you provide a method to rewrite the headers to valid DICOM before passing the files to panimg?

I would much prefer an alternative, but if we did add this option, we would still always want to validate the headers on grand challenge. Responsibility for header validation lies with the individual image builders, but we do not currently have a way of passing settings down to them. In designing that, we would need to think about whether a header (or metadata) validation option is something that we want all image builders to support, or whether we want a setting that turns off validation for individual headers.
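As a rough illustration of the "fix it before passing it to panimg" idea, here is a minimal sketch of a pre-processing step that blanks invalid date values. It works on a plain dict of header values rather than on real DICOM files, and every name in it (`DATE_KEYWORDS`, `is_valid_da`, `clean_date_headers`) is hypothetical, not part of panimg. It assumes that an empty value is acceptable for the listed attributes, which depends on the attribute type and should be checked against the DICOM standard for each field.

```python
from datetime import datetime

# Hypothetical subset of date-valued DICOM keywords (value representation DA).
DATE_KEYWORDS = ("PatientBirthDate", "StudyDate", "SeriesDate")


def is_valid_da(value: str) -> bool:
    """Return True if value is a DICOM DA string (YYYYMMDD) naming a real date."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def clean_date_headers(headers: dict) -> tuple:
    """Return a cleaned copy of headers and the list of keywords that were blanked.

    Assumption: blanking is acceptable for these attributes. The input dict
    is left untouched so the caller can still inspect the original values.
    """
    cleaned = dict(headers)
    changed = []
    for keyword in DATE_KEYWORDS:
        value = cleaned.get(keyword)
        if value and not is_valid_da(value):
            cleaned[keyword] = ""
            changed.append(keyword)
    return cleaned, changed


if __name__ == "__main__":
    headers = {"PatientBirthDate": "ANONYMIZED", "StudyDate": "20200131"}
    cleaned, changed = clean_date_headers(headers)
    print(cleaned, changed)
```

Returning the list of changed keywords keeps the modification explicit: whoever runs the clean-up knows exactly which fields were altered before the files reach the validating library.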

nlessmann commented 2 years ago

Makes sense - what do you think about just not passing on invalid headers instead of rejecting the image entirely?

jmsmkn commented 2 years ago

I still think that the library should reject it and leave it to the user to decide what to do. panimg validates the image and the metadata, and I think users would be surprised if metadata went missing because we replaced it with some other value or dropped it entirely.
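The "reject and let the user decide" behaviour could look roughly like the sketch below. This is not panimg's API: `HeaderValidationError`, `validate_headers`, and `looks_like_date` are hypothetical names, and the headers are modelled as a plain dict. The point is only that the library raises with enough detail for the caller to choose a policy, and any dropping of fields happens visibly in the caller's code.

```python
class HeaderValidationError(ValueError):
    """Raised when one or more header values fail validation (hypothetical)."""

    def __init__(self, invalid: dict):
        self.invalid = dict(invalid)  # keyword -> offending value
        super().__init__("invalid header values: " + ", ".join(sorted(self.invalid)))


def looks_like_date(value: str) -> bool:
    """Deliberately simple stand-in check: eight ASCII digits."""
    return len(value) == 8 and value.isascii() and value.isdigit()


def validate_headers(headers: dict, validators: dict) -> dict:
    """Return a copy of headers if all checks pass; otherwise reject the whole set."""
    invalid = {
        keyword: headers[keyword]
        for keyword, check in validators.items()
        if keyword in headers and not check(headers[keyword])
    }
    if invalid:
        raise HeaderValidationError(invalid)
    return dict(headers)


if __name__ == "__main__":
    validators = {"PatientBirthDate": looks_like_date}
    headers = {"PatientBirthDate": "ANONYMIZED", "Modality": "CT"}
    try:
        validate_headers(headers, validators)
    except HeaderValidationError as error:
        # The caller, not the library, decides to drop the offending fields.
        kept = {k: v for k, v in headers.items() if k not in error.invalid}
        print(validate_headers(kept, validators))
```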