Open NorbertLindenberg opened 4 years ago
@NorbertLindenberg, Good point. These default to BASE_IND. Certainly, better category assignments might be made for some of these.
Document has been updated.
The documentation now says that all characters with UISC = Other go into BASE_IND. That doesn’t work because there are characters that have UISC = Other but have already been assigned to other classes for good reasons. One example is the Balinese musical symbols 1B61..1B6A and 1B74..1B7C, which have general category So and therefore USE class SYM, which allows the combining marks 1B6B..1B73 to attach to them.
The intent probably is that BASE_IND gets all characters with UISC = Other that haven’t been claimed for other USE classes already. The USE class OTHER uses “Any other SCRIPT_COMMON characters” to describe a similar fallback. However, any such description would have to clarify the classification of characters that have both Script Common and UISC Other. Or maybe the classes BASE_IND and OTHER can simply be merged into one? Is there any reason to keep them separate?
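To make the question concrete, here is a rough sketch of the "fallback" reading described above. It is not the USE specification's algorithm. The function name `use_class`, the precedence order, and the `PER_UISC_TABLE` placeholder are all my own assumptions for illustration. In particular, sending characters that are both Script Common and UISC Other to OTHER is an arbitrary choice, and that is exactly the case the documentation would need to settle. Python's standard library exposes the general category but not the script or the Indic Syllabic Category, so those two are passed in by the caller.

```python
import unicodedata


def use_class(ch: str, uisc: str, script: str) -> str:
    """Hypothetical precedence for the cases discussed in this thread."""
    # Characters already claimed by another class keep that class.
    # Per the example above, general category So maps to SYM.
    if unicodedata.category(ch) == "So":
        return "SYM"

    if uisc == "Other":
        # ASSUMPTION: Common + Other goes to OTHER. The documentation
        # does not say which class wins here.
        if script == "Common":
            return "OTHER"
        # Fallback: UISC Other, not claimed by anything above.
        return "BASE_IND"

    # Placeholder: any other UISC value is mapped by the regular tables.
    return "PER_UISC_TABLE"


# Balinese musical symbol: UISC Other, but already SYM because gc = So.
print(use_class("\u1B61", "Other", "Balinese"))
# Devanagari OM: UISC Other, script not Common.
print(use_class("\u0950", "Other", "Devanagari"))
```

If BASE_IND and OTHER were merged, the inner `if script == "Common"` branch would simply disappear.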
The USE description doesn’t say how to interpret Brahmic characters for which the Unicode character data doesn’t define an Indic Syllabic Category, or whose ISC is the default value Other, and whose script is not Common. A few such characters in Unicode 13 are U+0971, U+0950, U+A8F4..U+A8F7, U+A8FB, U+A8FD, U+1CED, U+1CE2..U+1CE8. (Some of these characters probably should have different syllabic categories, and I’ve also reported this issue to the Unicode Consortium.)