Closed (syvb closed this 3 months ago)
There's no issue here. `GC` in the code does not stand for "General Category", but for `GraphemeCat`. The categories are read from https://www.unicode.org/Public/UNIDATA/auxiliary/GraphemeBreakProperty.txt, which contains the right values. This can be closed.
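To illustrate what reading categories from that file involves, here is a minimal sketch of a line parser. It is not this crate's actual table generator; `BreakEntry` and `parse_line` are hypothetical names, and the layout assumed is the usual UCD one (`code point or range ; property value # comment`).

```rust
/// One parsed data line: an inclusive code point range and its
/// Grapheme_Cluster_Break value. (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct BreakEntry {
    start: u32,
    end: u32,
    category: String,
}

/// Parse a single line in the assumed `RANGE ; VALUE # comment` layout.
/// Returns `None` for blank lines, comment-only lines, and malformed input.
fn parse_line(line: &str) -> Option<BreakEntry> {
    // Everything after '#' is a comment. In the real file that comment
    // carries the General_Category abbreviation (e.g. "Mc"), which is
    // separate from the Grapheme_Cluster_Break value in the second field.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }

    let (range, category) = data.split_once(';')?;
    let range = range.trim();
    let category = category.trim();
    if category.is_empty() {
        return None;
    }

    // A range is either "XXXX..YYYY" or a single code point "XXXX", in hex.
    let (start, end) = match range.split_once("..") {
        Some((lo, hi)) => (
            u32::from_str_radix(lo, 16).ok()?,
            u32::from_str_radix(hi, 16).ok()?,
        ),
        None => {
            let cp = u32::from_str_radix(range, 16).ok()?;
            (cp, cp)
        }
    };

    Some(BreakEntry {
        start,
        end,
        category: category.to_string(),
    })
}
```

With this sketch, a line assigning `SpacingMark` to U+0903 yields an entry whose `category` is `"SpacingMark"`, regardless of the General_Category noted in the trailing comment.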
UAX #29 defines `SpacingMark` as:

In this crate's implementation of rule GB9a, only the "General_Category = Spacing_Mark" part is checked. This crate doesn't check that Grapheme_Cluster_Break ≠ Extend, and it doesn't implement any of the 24 exclusions or 2 inclusions. The impact is very minor, though: it affects only a small set of characters, and only in extended mode.
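The difference between the two checks described here can be sketched as follows. This is an illustration of the logic only, not code from this crate: `SpacingMarkInputs` and both functions are hypothetical, the four flags are assumed to be looked up elsewhere, and the way the inclusions combine with the other conditions is my reading of the definition.

```rust
/// Per-code-point facts needed by the derivation.
/// (Hypothetical type; each flag is assumed to come from Unicode data tables.)
#[derive(Clone, Copy)]
struct SpacingMarkInputs {
    /// General_Category = Spacing_Mark (Mc)
    general_category_is_mc: bool,
    /// Grapheme_Cluster_Break = Extend
    gcb_is_extend: bool,
    /// One of the explicitly included code points
    explicitly_included: bool,
    /// One of the explicitly excluded code points
    explicitly_excluded: bool,
}

/// Full derivation as described above: Mc, not Extend, not excluded,
/// plus the explicit inclusions (assumed to apply unconditionally).
fn is_spacing_mark(i: SpacingMarkInputs) -> bool {
    let derived = i.general_category_is_mc && !i.gcb_is_extend && !i.explicitly_excluded;
    derived || i.explicitly_included
}

/// The reduced check this issue describes: General_Category alone.
fn is_spacing_mark_gc_only(i: SpacingMarkInputs) -> bool {
    i.general_category_is_mc
}
```

The two functions disagree exactly on the cases listed above: a character that is Mc but also Extend or excluded, and a character that is included without being Mc.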
(originally noted in #107)