Description
As reported by @nikonov1101 in https://github.com/mnako/letters/issues/49, `CharsetReader` was not performing a correct lookup for labels that do start with `windows-`.

This PR uses Thai language and encodings to illustrate the problem, add test cases, and fix the bug by first attempting a lookup of the original label, then (if not found) attempting a lookup of the normalised label, and (if not found again) raising an informative error. Assuming that most labels are correct and do not need the string replace, this should be the fastest approach.
Commits:
`decoders.decodeHeader.CharsetReader` to look up the original label, replace `windows-` with `cp` only if not found, and raise an informative error if not found again; show all test cases passing again.