the-moisrex closed this 1 month ago
It seems the decomposition gets the correct answers for \xFFC4 and \x1F133, which I initially thought would be wrong!
A canonical mapping may also consist of a pair of characters, but is never longer than two characters. When a canonical mapping consists of a pair of characters, the first character may itself be a character with a decomposition mapping, but the second character never has a decomposition mapping.
from UAX #44
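The rule quoted above means a full canonical decomposition only ever has to keep expanding the first character of a mapping; the second one can be emitted as is. Below is a minimal sketch of that idea in Python, using the standard `unicodedata` module as the data source. The function name `canonical_decompose` is hypothetical and is not this project's implementation. It also assumes Hangul syllables are handled separately, since they are decomposed arithmetically rather than through mapping data.

```python
import unicodedata


def canonical_decompose(cp):
    """Fully expand one code point using canonical mappings only.

    Hypothetical helper for illustration; Hangul syllables are not covered.
    """
    out = []
    while True:
        mapping = unicodedata.decomposition(chr(cp))
        # An empty string means no mapping. A leading "<tag>" marks a
        # compatibility mapping, which canonical decomposition leaves alone.
        if not mapping or mapping.startswith("<"):
            out.append(cp)
            break
        parts = [int(h, 16) for h in mapping.split()]
        if len(parts) == 2:
            # The second character never decomposes further, so it is
            # final. Results are collected back to front.
            out.append(parts[1])
        # Only the first character may need another round of expansion.
        cp = parts[0]
    out.reverse()
    return out


# U+1E69 expands in two steps: U+1E63 U+0307, then U+0073 U+0323 U+0307.
print([hex(c) for c in canonical_decompose(0x1E69)])
# \xFFC4 and \x1F133 only have compatibility mappings, so they stay as is.
print([hex(c) for c in canonical_decompose(0xFFC4)])
print([hex(c) for c in canonical_decompose(0x1F133)])
```

Because the loop never recurses into the second character, it needs no stack and finishes in as many steps as the chain of first characters is long.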
This is the problem with the algorithm now.
It works now.
toNFD doesn't work; the problem is either in decompose or in the canonical_reorder algorithm.
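One way to tell which of the two stages is at fault is to check the reordering step on its own against a reference. The sketch below is an assumption about how that could look, not the project's `canonical_reorder`: the name `reorder_combining_marks` is hypothetical. It stable-sorts each run of non-starters (characters with a non-zero canonical combining class) by class, and takes the classes from Python's `unicodedata.combining`.

```python
import unicodedata


def reorder_combining_marks(cps):
    """Stable-sort each run of non-starters by canonical combining class.

    Hypothetical helper for illustration; input is a list of code points
    that has already been decomposed.
    """
    out = list(cps)
    start = 0
    while start < len(out):
        if unicodedata.combining(chr(out[start])) == 0:
            # Starters (class 0) never move and end any run of marks.
            start += 1
            continue
        end = start
        while end < len(out) and unicodedata.combining(chr(out[end])) != 0:
            end += 1
        # sorted() is stable, so marks of equal class keep their order.
        out[start:end] = sorted(
            out[start:end], key=lambda c: unicodedata.combining(chr(c))
        )
        start = end
    return out


# Input that is already decomposed but out of order: s, dot above, dot below.
sample = [0x73, 0x307, 0x323]
reference = [ord(ch) for ch in unicodedata.normalize("NFD", "s\u0307\u0323")]
print(reorder_combining_marks(sample) == reference)
```

If a reordering routine agrees with the reference on inputs that are already decomposed, the remaining suspect is the decomposition step.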
I've added an option to disable the UTF-8 composition tests since they don't work yet, but I need the UTF-32 versions to work perfectly before I start dealing with that mess.