Open ThePhD opened 3 years ago
For what it's worth, the Unicode Consortium has published conversion tables for many of those encodings. Conversion to Unicode from these encodings ends up going through lookup tables; conversion back is likely the same for "properly normalized" Unicode.
The data can be found here: https://github.com/unicode-org/icu-data.
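To make the lookup-table idea concrete, here is a minimal sketch for a single-byte encoding. It is not taken from ICU or from this library: the names `high_table`, `decode_byte`, and `encode_code_point` are hypothetical, and the table is only an eight-entry excerpt of Windows-1252 (bytes 0x80 to 0x87), not a full table. Decoding is a direct index into the table; encoding here is a linear scan over the same table, where a real implementation would more likely use a sorted or hashed reverse table.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Illustrative excerpt only: Windows-1252 bytes 0x80..0x87.
// U'\0' marks a byte that has no assigned code point (0x81).
inline constexpr std::array<char32_t, 8> high_table = {
    U'\u20AC', U'\0',     U'\u201A', U'\u0192',
    U'\u201E', U'\u2026', U'\u2020', U'\u2021'
};

// Hypothetical helper: byte -> code point via direct table lookup.
inline std::optional<char32_t> decode_byte(std::uint8_t byte) {
    if (byte < 0x80) {
        return static_cast<char32_t>(byte); // ASCII range maps to itself
    }
    std::size_t index = byte - 0x80u;
    if (index >= high_table.size()) {
        return std::nullopt; // outside the excerpt covered by this sketch
    }
    char32_t cp = high_table[index];
    if (cp == U'\0') {
        return std::nullopt; // unassigned byte
    }
    return cp;
}

// Hypothetical helper: code point -> byte via reverse (linear) lookup.
inline std::optional<std::uint8_t> encode_code_point(char32_t cp) {
    if (cp < 0x80) {
        return static_cast<std::uint8_t>(cp);
    }
    for (std::size_t i = 0; i < high_table.size(); ++i) {
        if (high_table[i] == cp) {
            return static_cast<std::uint8_t>(0x80 + i);
        }
    }
    return std::nullopt; // code point not representable in this excerpt
}
```

With this excerpt, byte 0x80 decodes to U+20AC and U+20AC encodes back to 0x80, while byte 0x81 and any code point absent from the table yield an empty optional.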
Yeah, I've seen that!
For what it's worth, I've already started working on lookup tables for most of the single- and double-byte encodings. They're not derived from the ICU data, though, but from other sources.
Oh, nice! I was going by your encoding docs, which, I guess, are out of date then.
This is a running list of all the (mildly to extremely) cursed encodings, and whether or not we should implement them. More can be suggested on Twitter. Here goes:
- MULE_INTERNAL (Multilanguage Emacs internal encoding): Garbage encoding for an even more garbage text editor.
- UTF-EBCDIC: This may be patent-encumbered or license-checked, and therefore cannot be implemented.
- UTF-7: This may be patent-encumbered or license-prohibited, and therefore cannot be implemented.
- UTF-7-IMAP: This may be patent-encumbered or license-prohibited, and therefore cannot be implemented.
- UTF-1: Not a good encoding.

Some that might not be possible within the framework of this library:
encode_one/decode_one limitations potentially useless? Needs more research