w3c / i18n-glossary

Definitions of terms used in W3C Internationalization documents.
https://w3c.github.io/i18n-glossary/

Superset of Unicode? #14

Closed: xfq closed this issue 10 months ago

xfq commented 2 years ago

https://www.w3.org/TR/i18n-glossary/#def_universal_character_set

Other character sets are subsets of Unicode.

I wonder if that's really the case since obviously there are exceptions like Emacs:

himorin commented 2 years ago

It might be better to include the Elisp one as well? (It was extended for backward compatibility...) https://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Representations.html

aphillips commented 11 months ago

It's really true that other character sets are subsets of Unicode.

The Emacs situation is that it has a larger code space than Unicode (which it partially gets by using the original UTF-8 design that allowed up to six bytes per character), but the additional code space isn't useful in any practical sense. It's also best to ignore such claims, since we don't want people thinking there is some other source of truth for character encoding. There are no character sets with more encoded characters than Unicode.
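To make the code-space point concrete, here is a minimal Python sketch of the original (pre-2003) UTF-8 byte layout, which allowed sequences of up to six bytes. The function name `encode_legacy_utf8` and the constants are hypothetical names chosen for illustration; this is not Emacs's actual implementation. The Emacs upper limit of `0x3FFFFF` is taken from the Elisp manual page linked earlier in this thread and should be treated as an assumption here.

```python
UNICODE_MAX = 0x10FFFF          # last code point in Unicode's code space
EMACS_MAX = 0x3FFFFF            # assumed Emacs internal limit (per the Elisp manual)
LEGACY_UTF8_MAX = 0x7FFFFFFF    # largest value the six-byte UTF-8 design can hold

# (largest code point, lead-byte marker, number of continuation bytes)
_FORMS = [
    (0x7FF, 0xC0, 1),
    (0xFFFF, 0xE0, 2),
    (0x1FFFFF, 0xF0, 3),
    (0x3FFFFFF, 0xF8, 4),
    (0x7FFFFFFF, 0xFC, 5),
]


def encode_legacy_utf8(cp: int) -> bytes:
    """Encode an integer using the original UTF-8 layout (1 to 6 bytes)."""
    if not 0 <= cp <= LEGACY_UTF8_MAX:
        raise ValueError(f"value out of range: {cp:#x}")
    if cp < 0x80:
        return bytes([cp])  # ASCII is a single byte
    for limit, lead, n in _FORMS:
        if cp <= limit:
            # The lead byte carries the top bits; each continuation byte carries 6 bits.
            out = [lead | (cp >> (6 * n))]
            for i in range(n - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise AssertionError("unreachable: range was checked above")


if __name__ == "__main__":
    for cp in (0x20AC, UNICODE_MAX, UNICODE_MAX + 1, EMACS_MAX, LEGACY_UTF8_MAX):
        print(f"{cp:#x}: {encode_legacy_utf8(cp).hex(' ')}")
```

Within Unicode's range the output matches standard UTF-8. Values above `0x10FFFF` still produce a byte sequence under this layout, which shows a larger code space, but none of those values corresponds to an encoded character.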

What is the proposal here?

xfq commented 10 months ago

Fair enough. Closing this issue.