hugolpz closed 7 years ago
What about Unihan?
With the hexadecimal codepoint we can get the glyph like this in Python:
>>> print(chr(int('0x897F', 16)))
西
A JS solution would be better, but since this is outside the scope of the project, we can do it any way we see fit.
Please check out :
Thanks for the link. cjk-unihan might be useful for other projects.
I think it's better to limit the project to generating fonts and to outsource the data gathering/validation to another project. This way we stay focused and efficient.
I'm closing this, as different users might have different needs and will therefore handcraft their own dictionaries.
I reckon the JS solution is in the tobei/unihan code:
const character = String.fromCodePoint(parseInt(code.substring(2), 16));
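For comparison, here is a rough Python counterpart of that JS line. It is only a sketch: the function name glyph_from_code is made up for illustration, and it assumes the code string always carries a two-character prefix such as "U+" or "0x", which is what substring(2) implies.

```python
def glyph_from_code(code: str) -> str:
    """Convert a prefixed hex code such as 'U+897F' or '0x897F' to its character.

    Hypothetical helper: assumes a two-character prefix, like substring(2) in the JS line.
    """
    # Drop the prefix, parse the rest as hexadecimal, then map the number to a character.
    return chr(int(code[2:], 16))


print(glyph_from_code("U+897F"))
```

Unlike String.fromCharCode, both chr() here and String.fromCodePoint in the JS line accept code points above U+FFFF, so the CJK extension blocks should not need special handling.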
Did you gather the data?
Not yet, could you work on a project to do so?
@hugolpz I think you have a typo in your comment, there is a ratio of 1:10 between node-pinyin and unihan characters/phonetic pairs. Can you confirm/correct this number?
We can get the codepoint using punycode
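If the lookup ends up being done in Python rather than JS, the reverse direction needs no extra library. A minimal sketch, with code_from_glyph as a made-up name and the "U+" output format as an assumption about what the data files expect:

```python
def code_from_glyph(glyph: str) -> str:
    """Return the 'U+XXXX' notation for a single character (hypothetical helper)."""
    # ord() gives the full code point, including characters outside the BMP.
    return f"U+{ord(glyph):04X}"


print(code_from_glyph("西"))
```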
We are currently looking for a database with entries such as
{ "glyph": "西", "phonetic": "xī" }
(or xi1, or alternatives).
Possible sources, info to complete:
Moedict
Unicode :
CJKlib
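To make the target format concrete, here is a sketch of how glyph/phonetic pairs could be pulled out of Unihan-style data. Assumptions, none of which come from this thread: the input uses the tab-separated layout of Unihan_Readings.txt (code point, field name, value), the kMandarin field holds the pinyin, and only the first listed reading is kept. The names parse_readings and to_numbered and the extra "numbered" key are made up for illustration.

```python
import unicodedata

# Assumed sample in the tab-separated layout of Unihan_Readings.txt.
SAMPLE = "# comment line\nU+897F\tkMandarin\txī\nU+897F\tkCantonese\tsai1\n"

# Combining tone marks mapped to tone numbers (macron, acute, caron, grave).
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030C": "3", "\u0300": "4"}


def to_numbered(syllable: str) -> str:
    """Turn 'xī' into 'xi1'; a syllable without a tone mark gets no digit."""
    tone = ""
    letters = []
    # NFD splits an accented vowel into the base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # Recompose so that a remaining diaeresis (ü) stays a single character.
    return unicodedata.normalize("NFC", "".join(letters)) + tone


def parse_readings(text: str) -> list:
    """Collect {"glyph", "phonetic", "numbered"} records from kMandarin lines."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code, field, value = line.split("\t", 2)
        if field != "kMandarin":
            continue
        phonetic = value.split()[0]  # keep the first listed reading only
        records.append({
            "glyph": chr(int(code[2:], 16)),
            "phonetic": phonetic,
            "numbered": to_numbered(phonetic),
        })
    return records


print(parse_readings(SAMPLE))
```

Keeping both the tone-marked and the numbered form in each record is just one option; whichever source is chosen, either form can be derived from the other later.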