mercure-imaging / mercure

mercure DICOM Orchestrator
https://mercure-imaging.org
MIT License

Unable to perform character set conversion / Incoming charset is ISO 2022 IR 100 #59

Closed Bordasludovic closed 3 weeks ago

Bordasludovic commented 1 year ago

Hello,

I have this error message in a loop on one of my DICOM routers.

I'm under the impression that a facility has recently been sending us DICOM images with tags whose characters are non-convertible.

oct. 27 10:14:11 tlg-dcomr11 receiver.sh[742888]: ERROR: Unable to perform character set conversion!
oct. 27 10:14:11 tlg-dcomr11 receiver.sh[742888]: ERROR: Incoming charset is ISO 2022 IR 100

Do you have any idea how to fix this?

alipairon commented 1 year ago

I had the same problem about a year ago. Recompiling dcmtk storescp and getdcmtags against the libiconv library solved it.

Bordasludovic commented 1 year ago

I think it's because of the following characters that are contained in several DICOM tags in the two files stuck in the incoming folder: { and }

rogerbramon commented 7 months ago

We're also experiencing this issue. Have you discovered anything about its cause? Could you share how to resolve it?

Thanks!

alipairon commented 7 months ago

Check my last post here: https://github.com/mercure-imaging/mercure/issues/52

richrosenbaum commented 7 months ago

Hi - I'm running 0.2.0-beta4. Everything's been going fine for a long while, but similarly all of a sudden this started showing up all the time:

ERROR: Unable to perform character set conversion!
ERROR: Incoming charset is ISO 2022 IR 6

I looked briefly at all the dicom tags and did not see any funky characters that I wouldn't expect.

Is the recommended fix to follow the "getdcmtags compilation guide"?

Thanks very much, Rich

RoyWiggins commented 1 month ago

@richrosenbaum @rogerbramon @Bordasludovic

We suspect that this issue may be caused by invalid DICOM files. The DICOM tag SpecificCharacterSet operates in two modes: with "code extensions" and without. The "ISO 2022" character set names are only valid in code extension mode. The mode is selected by the number of values in SpecificCharacterSet: if there is only one value, the tag is not parsed in code extension mode, and the ISO 2022 character set names are invalid. @richrosenbaum's DICOM file should probably have had either SpecificCharacterSet=ISO_IR 6 or SpecificCharacterSet=\ISO 2022 IR 6. More details here.
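To make the rule concrete, here is a minimal Python sketch of the check described above. It is only an illustration: the function names are made up for this example and it is not code from mercure or DCMTK. It assumes the raw tag value is available as a string in which multiple values are separated by a backslash, and it does not validate the individual character set names.

```python
def uses_code_extensions(specific_character_set: str) -> bool:
    """Return True when the raw value holds more than one backslash-separated term."""
    return "\\" in specific_character_set


def is_consistent(specific_character_set: str) -> bool:
    """Return False for a single-valued tag that uses an 'ISO 2022 ...' name.

    Hypothetical helper: it only checks the single-value vs. multi-value
    rule discussed in this thread, nothing else.
    """
    terms = [term.strip() for term in specific_character_set.split("\\")]
    if len(terms) > 1:
        # Multi-valued, so code extension mode: ISO 2022 names are expected.
        return True
    # Single-valued, so no code extensions: ISO 2022 names are not allowed.
    return not terms[0].startswith("ISO 2022")


print(is_consistent("ISO_IR 100"))        # True
print(is_consistent("ISO 2022 IR 100"))   # False (the case from this issue)
print(is_consistent("\\ISO 2022 IR 6"))   # True
```

Note that `"\\ISO 2022 IR 6"` in Python source is a single backslash followed by the name, i.e. an empty first value and `ISO 2022 IR 6` as the second.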

I have committed a change that tries to recover from this issue by pretending that, e.g., SpecificCharacterSet=ISO 2022 IR 6 is really SpecificCharacterSet=\ISO 2022 IR 6, on the theory that the file was meant to be written that way (with two character sets, therefore in code extension mode, and with the first character set blank, therefore ASCII). If this assumption is wrong, mercure may see garbled tags for this DICOM file, so any rules based on those tags could fail to trigger properly.
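The idea behind the recovery can be sketched in a few lines of Python. This is an illustration of the rewrite rule only, not the code that was committed, and `recover_character_set` is a hypothetical name chosen for this example.

```python
def recover_character_set(raw: str) -> str:
    """Rewrite a single-valued 'ISO 2022 ...' value as a two-valued one.

    Assumption (as discussed above): the writer meant code extension mode
    with an empty first value, which stands for the default repertoire.
    """
    value = raw.strip()
    if "\\" not in value and value.startswith("ISO 2022 "):
        # Prepend an empty first value: "ISO 2022 IR 6" -> "\ISO 2022 IR 6"
        return "\\" + value
    # Already multi-valued, or not an ISO 2022 name: leave untouched.
    return value


fixed = recover_character_set("ISO 2022 IR 6")
print(fixed.split("\\"))  # ['', 'ISO 2022 IR 6']
```

Values that are already valid, such as `ISO_IR 100`, pass through unchanged, so the rewrite only affects the inconsistent case.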

This change doesn't update the DICOM files on disk, so further processing of files with an invalid SpecificCharacterSet may still not work properly.

You can test this either by updating to HEAD (which may break unrelated things) or by just grabbing the new Ubuntu 22 binary and inserting it into your installation. The latter should work, though you may also have to update to the latest receiver.sh.

You can manually test this on a DICOM file by running getdcmtags on it and seeing if it errors. Let us know if you get a chance to test this on any known-bad DICOMs and whether it resolves the problem!
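If you have a folder of suspect files, a small Python wrapper like the one below could automate that check. It is a sketch under assumptions: it treats a non-zero exit status or an "ERROR:" line in the output as a failure, and it simply appends the file path to whatever command you pass in. The exact arguments getdcmtags expects are not covered here, so adjust the command list to match your installation.

```python
import subprocess
from pathlib import Path


def find_failing_files(folder, command):
    """Run `command + [path]` on every file in `folder` and return the paths that fail.

    `command` is a list such as ["/path/to/getdcmtags"] (hypothetical path;
    add any extra arguments your binary needs before the file path).
    """
    failing = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        result = subprocess.run(
            command + [str(path)],
            capture_output=True,
            text=True,
            errors="replace",  # tool output may not be valid UTF-8
        )
        output = result.stdout + result.stderr
        # Assumption: failures show up as a non-zero exit code or an "ERROR:" line.
        if result.returncode != 0 or "ERROR:" in output:
            failing.append(path)
    return failing
```

For example, `find_failing_files("/path/to/incoming", ["/path/to/getdcmtags"])` would return the files that still trigger the error, with both paths being placeholders.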

tblock79 commented 3 weeks ago

Closing this issue as the problem should be addressed in the latest version. Please re-open if the fix does not solve the problem (difficult to test for us, as this seems to be caused by DICOM files with invalid encoding settings).