ilius / pyglossary

A tool for converting dictionary files aka glossaries. Mainly to help use our offline glossaries in any Open Source dictionary we like on any modern operating system / device.
GNU General Public License v3.0

Support reading KANJIDIC #386

Open epistularum opened 2 years ago

epistularum commented 2 years ago

JMdict and JMnedict can both be extracted already, so perhaps there is a reason KANJIDIC isn't supported, but I believe it would be a useful addition.

https://www.edrdg.org/wiki/index.php/KANJIDIC_Project

Edit: I just realized that JMnedict conversion is not fully supported. PyGlossary accepts the format and converts it, but it only exports the writings and readings and discards all other information. I believe a partial export is only possible because the file format is so similar to JMdict's. Perhaps it is also worth adding support for?

ilius commented 2 years ago

Where can I download JMnedict files?

epistularum commented 2 years ago

Sorry about that, here it is: https://www.edrdg.org/wiki/index.php/Main_Page#The_ENAMDICT/JMnedict_Project

ilius commented 2 years ago

Added JMnedict. Please try it and let me know whether everything looks good (data, formatting, etc.).

epistularum commented 2 years ago

It looks good, thank you very much for adding support for it!

ilius commented 1 year ago

This library's code should be useful: https://github.com/neocl/jamdict

homocomputeris commented 9 months ago

Some relevant projects:
https://github.com/scriptin/jmdict-simplified - converts KANJIDIC XML to JSON (Kotlin)
https://github.com/tim-harding/kanjidic_utilities/tree/master/kanjidic_converter - converts KANJIDIC XML to JSON (Rust)
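
For reference, the same kind of XML-to-JSON conversion can be sketched in Python with only the standard library. This is a rough illustration, not PyGlossary's reader API: the function name `parse_kanjidic2` and the output keys are hypothetical, the sample entry is hand-written, and the element names (`character`, `literal`, `reading` with `r_type`, `meaning` with `m_lang`) are assumed from the KANJIDIC2 XML format.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed KANJIDIC2 layout (not real downloaded data).
SAMPLE = """<kanjidic2>
  <character>
    <literal>亜</literal>
    <reading_meaning>
      <rmgroup>
        <reading r_type="ja_on">ア</reading>
        <reading r_type="ja_kun">つ.ぐ</reading>
        <meaning>Asia</meaning>
        <meaning>rank next</meaning>
        <meaning m_lang="fr">Asie</meaning>
      </rmgroup>
    </reading_meaning>
  </character>
</kanjidic2>"""


def parse_kanjidic2(xml_text):
    """Hypothetical helper: turn KANJIDIC2-style XML into a list of dicts."""
    root = ET.fromstring(xml_text)
    entries = []
    for char in root.iter("character"):
        on_readings, kun_readings, meanings = [], [], []
        for reading in char.iter("reading"):
            r_type = reading.get("r_type")
            if r_type == "ja_on":
                on_readings.append(reading.text)
            elif r_type == "ja_kun":
                kun_readings.append(reading.text)
        for meaning in char.iter("meaning"):
            # Assumption: English meanings carry no m_lang attribute.
            if meaning.get("m_lang") is None:
                meanings.append(meaning.text)
        entries.append({
            "literal": char.findtext("literal"),
            "on": on_readings,
            "kun": kun_readings,
            "meanings": meanings,
        })
    return entries


if __name__ == "__main__":
    # ensure_ascii=False keeps the kanji and kana readable in the JSON output.
    print(json.dumps(parse_kanjidic2(SAMPLE), ensure_ascii=False, indent=2))
```

A real reader would stream the full file (for example with `ET.iterparse`) rather than load it into memory, and would also have to decide what to do with the other per-character data such as stroke counts and dictionary references.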