soasis / text

A spicy text library for C++ that has the explicit goal of enabling the entire ecosystem to share in proper forward progress towards a bright Unicode future.
https://ztdtext.readthedocs.io/en/latest/

Don't Cursed Open Inside #21

Open ThePhD opened 3 years ago

ThePhD commented 3 years ago

This is a running list of all the (mildly to extremely) cursed encodings, and whether or not we should implement them. More can be suggested on Twitter. Here goes:

Some that might not be possible within the framework of this library:

marzojr commented 1 year ago

For what it's worth, the Unicode Consortium has published conversion tables for many of those encodings. Conversion to Unicode from these encodings ends up going through lookup tables, and conversion back is likely the same for "properly normalized" Unicode.

The data can be found here: https://github.com/unicode-org/icu-data.
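To make the lookup-table idea concrete, here is a minimal sketch of a table-driven single-byte decoder and encoder. It does not use ztd.text's API or the ICU data format. The names `table_entry`, `toy_table`, `decode_byte`, `encode_code_point`, `decode`, and `check_round_trip` are all hypothetical, and the table is a three-entry toy excerpt (the values match Windows-1252 for those bytes) rather than a complete encoding.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical toy table: bytes below 0x80 are treated as ASCII, and only
// three upper-half bytes are mapped. A real table would cover all of them.
struct table_entry {
    std::uint8_t byte;
    char32_t code_point;
};

constexpr std::array<table_entry, 3> toy_table{{
    {0x80, 0x20AC}, // EURO SIGN
    {0x85, 0x2026}, // HORIZONTAL ELLIPSIS
    {0x99, 0x2122}, // TRADE MARK SIGN
}};

// Byte -> code point: ASCII passes through, everything else is looked up.
std::optional<char32_t> decode_byte(std::uint8_t byte) {
    if (byte < 0x80) return static_cast<char32_t>(byte);
    for (const auto& e : toy_table)
        if (e.byte == byte) return e.code_point;
    return std::nullopt; // unmapped byte
}

// Code point -> byte: the reverse lookup only matches exact code points,
// so input in a decomposed form would have to be normalized first.
std::optional<std::uint8_t> encode_code_point(char32_t cp) {
    if (cp < 0x80) return static_cast<std::uint8_t>(cp);
    for (const auto& e : toy_table)
        if (e.code_point == cp) return e.byte;
    return std::nullopt; // not representable in this toy encoding
}

// Decode a whole byte string, failing on the first unmapped byte.
std::optional<std::u32string> decode(std::string_view input) {
    std::u32string out;
    out.reserve(input.size());
    for (char c : input) {
        const auto cp = decode_byte(static_cast<std::uint8_t>(c));
        if (!cp) return std::nullopt;
        out.push_back(*cp);
    }
    return out;
}

// Sanity check: every mapped byte survives decode followed by encode.
void check_round_trip() {
    for (const auto& e : toy_table) {
        const auto cp = decode_byte(e.byte);
        assert(cp && encode_code_point(*cp) == e.byte);
    }
}
```

A linear scan is fine for three entries. A full single-byte table would more likely be a 128-element array indexed by `byte - 0x80` for decoding, plus a sorted array searched by code point for encoding.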

ThePhD commented 1 year ago

Yeah, I've seen that!

For what it's worth, I've already started working on lookup tables for most of the single- and double-byte encodings. They're not derived from the ICU data, though, but from other sources.

See here: https://github.com/soasis/encoding_tables

marzojr commented 1 year ago

Oh, nice! I was going by your encoding docs, which, I guess, are out of date then.