Korean character support

smikitky commented 6 years ago

Currently this extension cannot handle Korean characters (Hangul). The DICOM standard says the following:

- Use the KS X 1001 character set, formerly known as KS C 5601 and registered as ISO-IR 149
- Use the ISO-2022-KR encoding, which uses escape sequences
  - G0 set: ESC 02/04 02/08 04/03
  - G1 set: 01/11 02/04 02/09 04/03 or ESC $ ) C (i.e., "Use Hangul from here")
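
For reference, the column/row notation above maps to bytes as `column * 16 + row`. A minimal sketch of that conversion for the G1 sequence follows; the helper name `columnRowToByte` is made up for illustration and is not part of the extension:

```typescript
// Convert ISO 2022 "column/row" notation (e.g. "02/04") into a byte value.
// Hypothetical helper, shown only to illustrate the notation.
function columnRowToByte(notation: string): number {
  const [column, row] = notation.split("/").map((part) => parseInt(part, 10));
  return column * 16 + row;
}

// The G1 designation sequence quoted above.
const g1Designation = ["01/11", "02/04", "02/09", "04/03"].map(columnRowToByte);

// The first byte is ESC (0x1b); the remaining three are printable ASCII.
const g1Printable = String.fromCharCode(...g1Designation.slice(1));
```

Here `g1Designation` is the four bytes `0x1b 0x24 0x29 0x43`, and `g1Printable` is `$)C`, which is why the sequence is usually written as ESC $ ) C.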

The problem is that ISO-2022-KR is a very rare encoding and I cannot find a pure-JS decoder for it. EUC-KR appears to be very similar, and CP949 is a superset of EUC-KR. A hacky solution would be to just decode the text as CP949 and remove the 4-byte escape sequence using a regex. This works at least in this example, but I don't know if it's the right approach.

smikitky commented 5 years ago

Okay, it turns out that EUC-KR and ISO-2022-KR are almost identical, except that the former implicitly invokes multibyte characters into the G1 area whereas the latter requires you to invoke them explicitly using the four-byte sequence ESC $ ) C. This means we can simply remove the escape sequence and treat the rest as a string in EUC-KR.
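
A rough sketch of that idea, not necessarily what the extension actually does: strip every ESC $ ) C sequence from the raw bytes, then decode the remainder as EUC-KR. The function names are hypothetical, and it assumes the runtime's `TextDecoder` accepts the `"euc-kr"` label (browsers do; Node.js needs a build with full ICU).

```typescript
// ESC $ ) C: the four-byte sequence that designates Hangul to G1.
const HANGUL_G1_ESCAPE = [0x1b, 0x24, 0x29, 0x43];

// Return a copy of the input with every ESC $ ) C sequence removed.
// Hypothetical helper; only this one escape sequence is handled.
function stripHangulEscapes(bytes: Uint8Array): Uint8Array {
  const out: number[] = [];
  let i = 0;
  while (i < bytes.length) {
    const isEscape = HANGUL_G1_ESCAPE.every((b, k) => bytes[i + k] === b);
    if (isEscape) {
      i += HANGUL_G1_ESCAPE.length; // skip the whole escape sequence
    } else {
      out.push(bytes[i]);
      i += 1;
    }
  }
  return Uint8Array.from(out);
}

// Decode a DICOM value that uses ISO 2022 IR 149 by treating it as EUC-KR.
// Assumes TextDecoder supports the "euc-kr" label in this runtime.
function decodeKorean(bytes: Uint8Array): string {
  return new TextDecoder("euc-kr").decode(stripHangulEscapes(bytes));
}

// Usage: the escape sequence followed by the EUC-KR bytes for a Hangul syllable.
const sample = Uint8Array.from([0x1b, 0x24, 0x29, 0x43, 0xb0, 0xa1]);
const decodedSample = decodeKorean(sample);
```

Working on bytes rather than on an already-decoded string avoids depending on how the decoder treats the ESC byte. Other escape sequences allowed by the standard (for example, switching to a different character set mid-value) are left untouched by this sketch.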

smikitky commented 5 years ago

Fixed in 1.2.0

smikitky / vscode-dicom-dump

Korean character support #2