Closed by Artoria2e5 7 years ago
"Overlong" means unnecessarily long.
With or without CESU, and strict or loose UTF-8: which combination should be the default behavior? All 4 combinations can live in the same decoder (https://github.com/buganini/bsdconv/blob/master/modules/from/_UTF-8.c), selected with parameters like _UTF-8#strict, _UTF-8#loose, _UTF-8#cesu, and _UTF-8#nocesu.
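To make the four combinations concrete, here is a minimal sketch of one decoder driven by two flags. It is not the code in `_UTF-8.c`; the names `utf8_decode`, `UTF8_STRICT`, and `UTF8_CESU` are hypothetical, and the exact behavior of each mode is an assumption: strict rejects overlong forms, CESU combines a surrogate pair encoded as two 3-byte sequences, loose without CESU passes a lone surrogate through, and values above U+10FFFF are rejected in every mode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option flags for illustration; not bsdconv's actual API. */
enum { UTF8_STRICT = 1, UTF8_CESU = 2 };

/* Decode one structurally valid sequence of 1-4 bytes.
 * Returns the number of bytes consumed, or 0 on malformed input. */
static size_t raw_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; *cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; *cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; *cp = s[0] & 0x07; }
    else return 0;
    if (n < len) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Shortest encoding length for a code point. */
static size_t min_len(uint32_t cp)
{
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

/* Decode one code point under the given flags.
 * Returns bytes consumed, or 0 if the input is rejected. */
size_t utf8_decode(const unsigned char *s, size_t n, int flags, uint32_t *cp)
{
    uint32_t lo;
    size_t len = raw_decode(s, n, cp), len2;

    if (len == 0) return 0;
    if (*cp > 0x10FFFF) return 0; /* assumption: rejected in all modes */
    if ((flags & UTF8_STRICT) && len != min_len(*cp)) return 0; /* overlong */
    if (*cp >= 0xD800 && *cp <= 0xDFFF) {
        /* Without CESU: strict rejects surrogates, loose passes them through. */
        if (!(flags & UTF8_CESU)) return (flags & UTF8_STRICT) ? 0 : len;
        /* With CESU: a high surrogate must be followed by a 3-byte low surrogate. */
        if (*cp > 0xDBFF) return 0;
        len2 = raw_decode(s + len, n - len, &lo);
        if (len2 != 3 || lo < 0xDC00 || lo > 0xDFFF) return 0;
        *cp = 0x10000 + ((*cp - 0xD800) << 10) + (lo - 0xDC00);
        return len + len2;
    }
    return len;
}
```

With this shape, `_UTF-8#strict` and `_UTF-8#cesu` would simply map to setting the corresponding bit, and the default is whichever flag value the decoder starts with.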
"Strict" UTF-8 without CESU should be made the default.
For UTF-8, does "overlong" mean unnecessarily long (like using \xC1\xA1 for 'a'), or a code point above U+10FFFF?
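The two cases in the question can be told apart mechanically. The following is a rough sketch, not taken from bsdconv: `utf8_classify` and its result codes are hypothetical names, and surrogate handling is deliberately left out. It decodes one sequence, then reports whether the value would have fit in fewer bytes (overlong) or exceeds U+10FFFF (out of range).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, for illustration only. */
enum utf8_class { UTF8_OK, UTF8_OVERLONG, UTF8_OUT_OF_RANGE, UTF8_MALFORMED };

/* Classify the single UTF-8 sequence starting at s (n bytes available). */
enum utf8_class utf8_classify(const unsigned char *s, size_t n)
{
    uint32_t cp;
    size_t len, i;

    if (n == 0) return UTF8_MALFORMED;
    if (s[0] < 0x80)                { len = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
    else return UTF8_MALFORMED;
    if (n < len) return UTF8_MALFORMED;

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return UTF8_MALFORMED;
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Overlong: the value would have fit in a shorter sequence. */
    if ((len == 2 && cp < 0x80) ||
        (len == 3 && cp < 0x800) ||
        (len == 4 && cp < 0x10000))
        return UTF8_OVERLONG;

    /* Out of range: a separate condition from overlong. */
    if (cp > 0x10FFFF) return UTF8_OUT_OF_RANGE;

    return UTF8_OK;
}
```

Under this sketch, \xC1\xA1 is classified as overlong because it decodes to U+0061, which fits in one byte, while \xF4\x90\x80\x80 decodes to U+110000 and is classified as out of range.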