Open clarfonthey opened 6 years ago
To get this mapping, you need two things:
1) which characters map to ASCII digits via NFKC, and
2) the ranges of characters with (decimal) numeric values.
From these, you can derive all the ranges of numerals that map to ASCII digits. This mostly gives you what you want.
IIRC, one of the subscript/superscript ranges (or one of the other ranges) does not have an NFKC mapping, so it would need special attention.
That said, we haven't worked on unic-ucd-numeric yet, which would be the solution to (2). For (1), unic-ucd-normal should have the data already.
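As a rough illustration of the range-based idea above, here is a minimal sketch that uses a small hand-written table instead of real UCD data. The names `DIGIT_RANGE_STARTS`, `SUPERSCRIPTS`, `digit_value`, and `to_ascii_digits` are hypothetical and not part of unic. The table covers only a few ranges, and a real implementation would presumably generate it from the NFKC and numeric-value data. Superscripts get their own table here simply because their code points are not contiguous.

```rust
/// First code point (the zero digit) of a few contiguous digit ranges.
/// Hand-picked for illustration only; not generated from the UCD.
const DIGIT_RANGE_STARTS: &[u32] = &[
    0x0030,  // ASCII digits
    0x2080,  // subscript digits
    0xFF10,  // fullwidth digits
    0x1D7D8, // mathematical double-struck digits
];

/// Superscript digits 0..=9. These are scattered across two blocks,
/// so they cannot be described by a single start offset.
const SUPERSCRIPTS: [char; 10] = [
    '\u{2070}', '\u{00B9}', '\u{00B2}', '\u{00B3}', '\u{2074}',
    '\u{2075}', '\u{2076}', '\u{2077}', '\u{2078}', '\u{2079}',
];

/// Returns the value 0..=9 of `c` if it is one of the digits listed above.
fn digit_value(c: char) -> Option<u8> {
    let cp = c as u32;
    for &start in DIGIT_RANGE_STARTS {
        if (start..start + 10).contains(&cp) {
            return Some((cp - start) as u8);
        }
    }
    // Fall back to the special-cased superscript table.
    SUPERSCRIPTS.iter().position(|&s| s == c).map(|i| i as u8)
}

/// Replaces every recognised digit in `s` with its ASCII form.
/// All other characters pass through unchanged.
fn to_ascii_digits(s: &str) -> String {
    s.chars()
        .map(|c| digit_value(c).map_or(c, |d| (b'0' + d) as char))
        .collect()
}
```

With this table, `to_ascii_digits("₁₂")` and `to_ascii_digits("¹²")` both produce `"12"`, while characters outside the table are left untouched.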
It'd be useful if unic had an API that converted bidirectionally between unsigned integers and various scripts. For example, here are a few representations of the number twelve:
12
"12"
"₁₂"
"¹²"
"𝟙𝟚"
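To make the request more concrete, here is a minimal sketch of what such a bidirectional API might look like, limited to the four styles listed above. Everything in it (`DigitStyle`, `format_number`, `parse_number`) is a hypothetical name chosen for illustration, not an existing unic API, and the digit tables are written by hand rather than derived from Unicode data.

```rust
/// The digit styles from the examples above (hypothetical, not part of unic).
#[derive(Clone, Copy, Debug, PartialEq)]
enum DigitStyle {
    Ascii,
    Subscript,
    Superscript,
    DoubleStruck,
}

/// Builds the ten digits of a contiguous range starting at `start`.
fn contiguous(start: u32) -> [char; 10] {
    let mut out = ['0'; 10];
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = char::from_u32(start + i as u32).expect("valid digit code point");
    }
    out
}

impl DigitStyle {
    const ALL: [DigitStyle; 4] = [
        DigitStyle::Ascii,
        DigitStyle::Subscript,
        DigitStyle::Superscript,
        DigitStyle::DoubleStruck,
    ];

    /// The characters for 0..=9 in this style.
    fn digits(self) -> [char; 10] {
        match self {
            DigitStyle::Ascii => contiguous(0x0030),
            DigitStyle::Subscript => contiguous(0x2080),
            DigitStyle::DoubleStruck => contiguous(0x1D7D8),
            // Superscript digits are not contiguous in Unicode.
            DigitStyle::Superscript => [
                '\u{2070}', '\u{00B9}', '\u{00B2}', '\u{00B3}', '\u{2074}',
                '\u{2075}', '\u{2076}', '\u{2077}', '\u{2078}', '\u{2079}',
            ],
        }
    }
}

/// Integer to string: writes `n` in base ten using the given digit style.
fn format_number(mut n: u64, style: DigitStyle) -> String {
    let digits = style.digits();
    let mut out = Vec::new();
    loop {
        out.push(digits[(n % 10) as usize]);
        n /= 10;
        if n == 0 {
            break;
        }
    }
    out.into_iter().rev().collect()
}

/// String to integer: detects the style from the first character and
/// returns `None` for empty input, mixed styles, unknown characters, or overflow.
fn parse_number(s: &str) -> Option<(u64, DigitStyle)> {
    let first = s.chars().next()?;
    let style = DigitStyle::ALL
        .iter()
        .copied()
        .find(|st| st.digits().contains(&first))?;
    let digits = style.digits();
    let mut n: u64 = 0;
    for c in s.chars() {
        let d = digits.iter().position(|&x| x == c)? as u64;
        n = n.checked_mul(10)?.checked_add(d)?;
    }
    Some((n, style))
}
```

Under these assumptions, `format_number(12, DigitStyle::Subscript)` yields `"₁₂"`, and `parse_number("₁₂")` returns the value 12 together with `DigitStyle::Subscript`. Rejecting mixed-style input is one possible design choice; a more lenient parser could accept any recognised digit per character.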
I recall seeing a crate that did something similar but I couldn't find it; I think having this kind of thing would be useful in unic.