-
It might be possible to add in data from the Unihan database too.
Character search interface: https://unicode.org/charts/unihan.html
Unihan download here: https://www.unicode.org/Public/UCD/lates…
-
Provides access to:
- `Unihan_DictionaryIndices`
- `Unihan_DictionaryLikeData`
- `Unihan_IRGSources`
- `Unihan_NumericValues`
- `Unihan_OtherMappings`
- `Unihan_RadicalStrokeCounts`
- `Unihan_R…
-
Moved here from CLDR-9775 where @macchiati wrote in 2016:
The code is pretty crufty, since it was mostly designed to synthesize data from different sources before kMandarin and kTotalStrokes were e…
-
I have found that I always need to convert the data into a dictionary (instead of the default list) when I'm using it. Because of this, I decided to always store the file in dictionary format. My meth…
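
As a rough illustration of the list-to-dictionary conversion described above, here is a minimal sketch. It assumes the standard tab-separated Unihan line layout (`U+XXXX<TAB>field<TAB>value`, with `#` comment lines); the function name `parse_unihan_lines` and the sample rows are hypothetical and are not taken from this project's code.

```python
from collections import defaultdict


def parse_unihan_lines(lines):
    """Group Unihan records by code point.

    Returns a plain dict shaped like {"U+4E00": {"kMandarin": "yī", ...}}.
    Hypothetical helper for illustration only.
    """
    table = defaultdict(dict)
    for raw in lines:
        line = raw.rstrip("\n")
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        # Each data line has three tab-separated parts; the value may contain spaces.
        codepoint, field, value = line.split("\t", 2)
        table[codepoint][field] = value
    return dict(table)


# Illustrative sample rows in the Unihan text layout.
sample = [
    "# comment line",
    "U+4E00\tkMandarin\tyī",
    "U+4E00\tkTotalStrokes\t1",
    "U+4E8C\tkMandarin\tèr",
]
table = parse_unihan_lines(sample)
print(table["U+4E00"]["kMandarin"])
```

Keying on the code point makes per-character lookups a single dictionary access, at the cost of holding the whole file in memory.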
-
Found a great place to use zhon's symbol lists: parsing regular expressions out of UNIHAN.
https://github.com/cihai/unihan-etl
https://github.com/cihai/unihan-etl/blob/335441a/unihan_etl/expansi…
-
Processing Unihan data (`Unihan_IRGSources.txt`, `Unihan_Readings.txt`, and `Unihan_Variants.txt`) into the ReverseMap, VariantRadicals, and BaseRadicals in the browser already takes more than 10 seco…
-
The Unihan database currently contains 431 instances of characters described as their own variants. This is logically inconsistent. The correct traditional variants should of course remain, but the lo…
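
One way such self-referential entries might be detected is sketched below. It assumes the `Unihan_Variants.txt` layout of tab-separated `code point, field, value` lines, where the value is a space-separated list of code points, each optionally followed by a `<source` suffix. The function name `find_self_variants` is hypothetical, and the sample rows are illustrative only and not checked against any Unihan release.

```python
def find_self_variants(lines):
    """Return (codepoint, field) pairs where a character lists itself as a variant.

    Hypothetical helper for illustration only.
    """
    hits = []
    for raw in lines:
        line = raw.rstrip("\n")
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        codepoint, field, value = line.split("\t", 2)
        for entry in value.split():
            # Drop an optional "<source" suffix such as "<kMatthews".
            target = entry.split("<", 1)[0]
            if target == codepoint:
                hits.append((codepoint, field))
    return hits


# Illustrative rows only; not verified against a specific Unihan release.
variant_sample = [
    "U+53F0\tkTraditionalVariant\tU+53F0 U+81FA",
    "U+81FA\tkSimplifiedVariant\tU+53F0",
    "U+4E18\tkSemanticVariant\tU+3400<kMatthews",
]
print(find_self_variants(variant_sample))
```

Running a check like this over the real file would give a concrete list of affected characters and fields to attach to a report.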
-
Currently we include the value of the Unihan kFrequency field for a character in the popup output.
According to the Unihan database, this is: "A rough frequency measurement for the character based …
-
CollationHanDatabase is a bit of a misnomer. Consider CollationHanDefault instead.
https://github.com/unicode-org/icu4x/issues/2709#issuecomment-1284604950
CC @markusicu
-
The module is very convenient, but it is still on Unicode 5.1.0. Is there any plan to upgrade to the latest Unicode standard? Or is there a recommended module for the latest Unihan database?
Thanks!