skishore / makemeahanzi

Free, open-source Chinese character data
https://www.skishore.me/makemeahanzi/
Other
1.82k stars 466 forks

Calligraphy-free stroke data? #76

Closed by DonaldTsang 3 years ago

DonaldTsang commented 4 years ago

https://github.com/KanjiVG/kanjivg is a great resource for Kanji, but as its name suggests, it only supports Japanese Kanji and does not include Traditional or Simplified Chinese characters.

Would it be possible to provide the same kind of stroke data for Chinese characters? I would like to recreate http://blog.otoro.net/2015/12/28/recurrent-net-dreams-up-fake-chinese-characters-in-vector-format-with-tensorflow/ and http://otoro.net/kanji/.

Also asking the same question at https://github.com/KanjiVG/kanjivg/issues/187

skishore commented 3 years ago

The "medians" field in the graphics.txt rows is my attempt to extract the "core" of each stroke.

It's not perfect, but it's accurate enough that handwriting recognition algorithms using that field work well.
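As a rough illustration of how that field might be consumed, here is a minimal Python sketch. It assumes each row of the data file is a single JSON object with a `character` key and a `medians` key, where `medians` holds one list of `[x, y]` points per stroke. The sample row and its coordinates are made up for the example, and the helper names `load_medians` and `to_svg_path` are hypothetical, not part of the project.

```python
import json

# Hypothetical sample row: the coordinates are invented for illustration
# and are not copied from the real data file.
SAMPLE_ROW = (
    '{"character": "十", '
    '"medians": [[[100, 500], [900, 500]], [[500, 900], [500, 100]]]}'
)


def load_medians(line):
    """Parse one JSON row and return (character, strokes).

    Each stroke is a list of (x, y) tuples tracing the stroke's centerline.
    """
    row = json.loads(line)
    strokes = [[(x, y) for x, y in stroke] for stroke in row["medians"]]
    return row["character"], strokes


def to_svg_path(stroke):
    """Turn one median polyline into an SVG path string: M x y L x y ..."""
    head, *tail = stroke
    parts = [f"M {head[0]} {head[1]}"] + [f"L {x} {y}" for x, y in tail]
    return " ".join(parts)


character, strokes = load_medians(SAMPLE_ROW)
paths = [to_svg_path(stroke) for stroke in strokes]
```

Each entry in `paths` is a thin polyline with no outline shape, which is roughly the kind of calligraphy-free input a sketch-style generative model would expect. To process a whole file, the same `load_medians` call could be applied to each line in turn.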

DonaldTsang commented 3 years ago

Thanks!