Closed rdeltour closed 1 year ago
Referencing the previous discussion around this in #1885 and #1899.
Ya, it seems we ended up with E007F deprecated despite wanting to allow emoji sequences...
I wonder if we can remove that bullet to avoid the redundancy of restricting each code point that Unicode already deprecates (and the future maintenance it entails). Maybe we can use the file you've referenced, @rdeltour, to create a new bullet at the end of the list, like:
Thoughts @iherman @xfq @r12a ?
For someone who has never looked at a Unicode listing closely... @mattgarrish I presume you refer to these lines in the file you referred to:
0149 ; Deprecated # L& LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
0673 ; Deprecated # Lo ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
0F77 ; Deprecated # Mn TIBETAN VOWEL SIGN VOCALIC RR
0F79 ; Deprecated # Mn TIBETAN VOWEL SIGN VOCALIC LL
17A3..17A4 ; Deprecated # Lo [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
206A..206F ; Deprecated # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
2329 ; Deprecated # Ps LEFT-POINTING ANGLE BRACKET
232A ; Deprecated # Pe RIGHT-POINTING ANGLE BRACKET
E0001 ; Deprecated # Cf LANGUAGE TAG
Maybe it is worth making a note on how to read that reference...
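As a rough guide to reading those lines: each entry is a hex code point or an inclusive range (`start..end`), then `;`, then the property name, then a `#` comment carrying the general category, an optional count in brackets, and the character name(s). Here is a minimal Python sketch, assuming only the line shape shown above. The names `parse_deprecated` and `is_deprecated` are made up for illustration, and the sample is a few of the quoted lines, not the whole file.

```python
import re

# Sample copied from the lines quoted above; NOT the full PropList.txt.
SAMPLE = """\
0149 ; Deprecated # L& LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
17A3..17A4 ; Deprecated # Lo [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
206A..206F ; Deprecated # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
E0001 ; Deprecated # Cf LANGUAGE TAG
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property name.
LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_deprecated(text):
    """Return a list of inclusive (start, end) ranges marked Deprecated."""
    ranges = []
    for raw in text.splitlines():
        data = raw.split("#", 1)[0].strip()  # drop the trailing comment
        m = LINE.match(data)
        if m and m.group(3) == "Deprecated":
            start = int(m.group(1), 16)
            end = int(m.group(2) or m.group(1), 16)  # single code point: end == start
            ranges.append((start, end))
    return ranges


def is_deprecated(cp, ranges):
    """True if the integer code point falls inside any parsed range."""
    return any(start <= cp <= end for start, end in ranges)


ranges = parse_deprecated(SAMPLE)
print(is_deprecated(0x206C, ranges))   # True: inside 206A..206F
print(is_deprecated(0xE007F, ranges))  # False: not in the sample
```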
I presume you refer to these lines in the file you referred to:
Right, I presume that list is all of them. I checked some of the other files but it appears the deprecated ones have been consolidated there.
It would have been nice if there were an HTML equivalent with a direct link, but searching around I couldn't find one. If there's another reference we could use, though...
The alternative, of course, is we say nothing about deprecated code points and assume that epubcheck should be warning about them, because, well, they're already deprecated by the official standard. That would be even better.
I think that, spec-wise, we should keep this in the spec. It would be strange if epubcheck defined the spec...
Ya, but it's back to that basic question we've bumped into a couple of times now of whether we need to restrict people from using things that are already deprecated by their respective specifications. Epubcheck would only be reporting what Unicode defines.
But I'm fine either way.
That is also correct...
We could also consider replacing the bullet point in the normative text above with a note whereby authors should also abide by any restrictions dictated by Unicode (who knows, they may come up, at some point, with a different notion than "deprecated"), giving deprecation as an example.
We can also toss a coin. :-)
a note whereby authors should also abide by any restrictions dictated by Unicode
Ya, I like this approach. I'll see what I can come up with.
EPUB OCF says U+E007F is disallowed as one of the two deprecated characters in the Tags and Variation Selectors Supplement.
But U+E007F CANCEL TAG was reinstated as non-deprecated in Unicode 9.0; see the change history for the Unicode Character Database.
See also the up-to-date list of deprecated characters in the latest UCD PropList.txt file (search for "Deprecated").
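To illustrate why this matters for emoji: below is a small Python sketch that scans text for deprecated code points. The ranges are hardcoded from the `Deprecated` lines quoted earlier in this thread and are assumed to be the complete list. A real checker would presumably read PropList.txt for the Unicode version it targets. `find_deprecated` is a hypothetical helper name, not anything epubcheck provides.

```python
# Inclusive ranges taken from the "Deprecated" lines quoted in this thread
# (assumed complete; regenerate from PropList.txt for a real check).
DEPRECATED_RANGES = [
    (0x0149, 0x0149), (0x0673, 0x0673), (0x0F77, 0x0F77), (0x0F79, 0x0F79),
    (0x17A3, 0x17A4), (0x206A, 0x206F), (0x2329, 0x2329), (0x232A, 0x232A),
    (0xE0001, 0xE0001),
]


def find_deprecated(text):
    """Return (index, 'U+XXXX') for each deprecated code point in text."""
    hits = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in DEPRECATED_RANGES):
            hits.append((i, f"U+{cp:04X}"))
    return hits


# Emoji tag sequence for the flag of England: U+1F3F4, the tag letters
# "gbeng", then U+E007F CANCEL TAG as the terminator.
england = "\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F"

print(find_deprecated(england))     # []
print(find_deprecated("a\u0149b"))  # [(1, 'U+0149')]
```

With U+E007F absent from the list, the flag sequence passes, while a genuinely deprecated character such as U+0149 is still reported.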