Closed RazrFalcon closed 4 years ago
These libraries are incorrect: they are not recursively normalizing the character. Please file bugs on them.
Uniview matches what we do.
But Uniview says: Character decomposition mapping: 0DDC 0DCA
@RazrFalcon yes, it decomposes twice: U+0DDC decomposes again.
I was talking about the NFD button on the text box, which does the right thing.
Is there a way to disable recursive normalization to get the results I'm looking for?
No. There is no such thing as non-recursive normalization; the normalization algorithm is recursive. What you want is the direct mapping that's in the Unicode tables, which is intermediate data and not as useful. This crate does not contain that data since we handle the recursive bit in the script step.
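To illustrate the difference between the direct table mapping and the recursive result, here is a minimal sketch. It is not the crate's implementation: `direct_mapping`, `decompose_full`, and `full` are hypothetical helpers, and the table holds only the two code points from this thread, hand-copied for illustration.

```rust
/// Direct (single-step) canonical mappings, as found in the Unicode data
/// tables. Hypothetical two-entry table covering only the code points
/// discussed in this thread; an empty slice means "no mapping".
fn direct_mapping(c: u32) -> &'static [u32] {
    match c {
        0x0DDD => &[0x0DDC, 0x0DCA],
        0x0DDC => &[0x0DD9, 0x0DCF],
        _ => &[],
    }
}

/// Expand `c` recursively until no code point has a mapping left.
fn decompose_full(c: u32, out: &mut Vec<u32>) {
    let parts = direct_mapping(c);
    if parts.is_empty() {
        // Nothing left to expand: emit the code point itself.
        out.push(c);
    } else {
        for &p in parts {
            decompose_full(p, out);
        }
    }
}

/// Convenience wrapper returning the full decomposition as a vector.
fn full(c: u32) -> Vec<u32> {
    let mut out = Vec::new();
    decompose_full(c, &mut out);
    out
}

fn main() {
    // One table lookup: what the data-table APIs report.
    println!("direct: {:04X?}", direct_mapping(0x0DDD));
    // Recursive expansion: what normalization produces.
    println!("full:   {:04X?}", full(0x0DDD));
}
```

The direct lookup for U+0DDD stops at 0DDC 0DCA, while the recursive expansion continues through U+0DDC and ends at 0DD9 0DCF 0DCA. This sketch only expands mappings; full NFD additionally reorders combining marks by canonical combining class.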
Both `hb_ucd_decompose` and `unicodedata.decomposition` are data table lookup APIs, primarily to be used to write a proper decomposition algorithm. You shouldn't be using these APIs directly: what are you attempting to do?
I see. I understand now. Thanks for the help.
Decomposing the `0x0DDD` character:

- `decompose_canonical`/`decompose_compatible`: 0x0DD9 0x0DCF 0x0DCA
- `harfbuzz::hb_ucd_decompose`: 0x0DDC 0x0DCA
- `unicodedata.decomposition`: 0x0DDC 0x0DCA

Is this a `unicode-normalization` bug or am I using it wrong?

Reproduces on stable release and on master.