-
It might be possible to add in data from the Unihan database too.
Character search interface: https://unicode.org/charts/unihan.html
Unihan download here: https://www.unicode.org/Public/UCD/lates…
-
The Unihan database currently contains 431 instances of characters described as their own variants. This is logically inconsistent. The correct traditional variants should of course remain, but the lo…
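A minimal sketch of how such self-referential entries could be listed. It assumes the tab-separated layout of the Unihan variants file (code point, field name, space-separated values with an optional `<source` suffix). The function name `find_self_variants` is hypothetical, and the sample lines are invented for illustration; they are not real Unihan entries.

```python
def find_self_variants(lines):
    """Return (codepoint, field) pairs where a character lists itself as a variant."""
    hits = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        codepoint, field, value = parts
        # A value is a space-separated list; each item may carry a
        # "<source" suffix, which is dropped before comparing.
        targets = [item.split("<", 1)[0] for item in value.split()]
        if codepoint in targets:
            hits.append((codepoint, field))
    return hits


# Invented sample lines (assumption: same shape as the real file).
sample = [
    "# invented sample lines",
    "U+4E00\tkTraditionalVariant\tU+4E00",
    "U+4E01\tkSimplifiedVariant\tU+4E02",
    "U+4E03\tkTraditionalVariant\tU+4E03 U+4E04<kFoo",
]
print(find_self_variants(sample))
```

The sketch only reports candidates. Whether a given self-reference should be removed or kept alongside other variants is a separate decision.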
-
The module is very convenient, but it is still at Unicode 5.1.0. Is there any plan to upgrade to the latest Unicode standard, or is there a recommended module for the latest Unihan database?
Thanks!
-
CollationHanDatabase is a bit of a misnomer. Consider CollationHanDefault instead.
https://github.com/unicode-org/icu4x/issues/2709#issuecomment-1284604950
CC @markusicu
-
Hi,
I have found an instance of bad data in the database, and I suspect there could be more. Should the Unihan data be automatically cleaned before importing?
```
from cihai.core import Cihai
from c…
-
```
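The current code tables are basically sufficient (apart from Terra Pinyin being a bit small), but rimeime
is an input method that strives for perfection, so it can be better.
This archive: http://www.unicode.org/Public/UNIDATA/Unihan.zip contains
Unihan_reading.txt
. After removing the Cantonese, Tang, Japanese, Korean, and Vietnamese readings, you get Hanyu Pinyin for more than forty thousand characters, with tones.
Academia Sinica Chinese character structure database: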
http://cdp.sinica.ed…
-
Currently we include the value of the Unihan kFrequency field for a character in the popup output.
According to the Unihan database, this is: "A rough frequency measurement for the character based …
-
Currently we display 4 possible pronunciation fields:
pinyin (from CEDICT)
mandarin (from Unihan -- almost always the same as pinyin, but apparently not always)
cantonese (from Unihan)
tan…