w3c / findtext

An API spec to define how to find text in a Web document, using basic information, and return DOM ranges

Unicode equivalence type non-specificity [I18N-ISSUE-501] #8

Open aphillips opened 9 years ago


http://www.w3.org/International/track/issues/501 [I18N-ISSUE-501]

http://www.w3.org/TR/2015/WD-findtext-20151015/#idl-def-UnicodeEquivalenceType.canonical

The unicodeEquivalentType parameter accepts four possible values. Three of these (canonical, compatibility, and all) permit either the composed (C) or decomposed (D) form without specifying which one is used. Because FindText works by matching code point sequences, the results may vary depending on which normalization form is chosen. The FindText API should either (a) specify whether the composed or decomposed sequence is used or (b) define separate values for composed and decomposed forms (that is, provide the four Unicode normalization forms: NFC, NFD, NFKC, and NFKD).
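
To illustrate why the choice of form matters, here is a minimal Python sketch using the standard `unicodedata` module. It is not the FindText API; the helpers `code_points` and `find_normalized` are hypothetical names introduced only for this example, and the sketch assumes a matcher that normalizes both the document text and the search string to the same form before comparing code points.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def find_normalized(haystack: str, needle: str, form: str) -> int:
    """Normalize both strings to `form`, then return the code point
    offset of the first match in the normalized haystack (-1 if none)."""
    normalized_haystack = unicodedata.normalize(form, haystack)
    normalized_needle = unicodedata.normalize(form, needle)
    return normalized_haystack.find(normalized_needle)


composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The same canonically equivalent text has different code point sequences.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# The offset of a match depends on the form used for matching.
text = "caf\u00e9!"
offset_nfc = find_normalized(text, "!", "NFC")
offset_nfd = find_normalized(text, "!", "NFD")

# Compatibility forms fold characters that canonical forms leave alone,
# for example the "fi" ligature U+FB01.
ligature_nfc = unicodedata.normalize("NFC", "\ufb01")
ligature_nfkc = unicodedata.normalize("NFKC", "\ufb01")

if __name__ == "__main__":
    print(code_points(nfc), code_points(nfd))
    print(offset_nfc, offset_nfd)
    print(code_points(ligature_nfc), code_points(ligature_nfkc))
```

In this sketch the exclamation mark is found at a different code point offset under NFC than under NFD, so any range boundaries computed against a normalized copy of the text would depend on which form was chosen.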