Converts Wiktionary data from https://kaikki.org/ to Yomitan-compatible dictionaries. Converted dictionaries can be found in the Downloads section.
The examples below convert German (de) to English (en).
Create a .env file based on .env.example.
If your language is not in languages.json, add it.
Run ./auto.sh German English.
Dictionaries should be in data/language/de/en.
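The steps above can be sketched as a small shell function, to be run from the repository root. This is only an illustration: the function name `quickstart` is hypothetical, and using `cp` to create `.env` is an assumption, since the instructions only say to base it on `.env.example` (you will likely still need to edit the values).

```shell
# Hypothetical helper wrapping the setup steps; nothing runs until it is called.
quickstart() {
    # Create .env from the template unless one already exists.
    # (Assumption: a plain copy is a reasonable starting point.)
    [ -f .env ] || cp .env.example .env || return 1

    # Build the German -> English dictionaries.
    ./auto.sh German English || return 1

    # List the output directory where the dictionaries should appear.
    ls data/language/de/en
}
```

Calling `quickstart` stops at the first failing step, so a missing `.env.example` or a failed conversion does not go unnoticed.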
Instead of a language name, you can also write ? to run for all languages.
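One caveat when typing the wildcard: `?` is also a glob character in most shells. An unquoted `?` is replaced by any single-character file name in the current directory, and zsh by default reports an error when nothing matches. Quoting it avoids both cases. The wrappers below are a minimal sketch; the names `all_to` and `all_from` are hypothetical and not part of the project.

```shell
# Hypothetical wrappers that always pass a literal "?" to auto.sh.

# Any source language -> the given target language.
all_to() {
    ./auto.sh '?' "$1"
}

# The given source language -> any target language.
all_from() {
    ./auto.sh "$1" '?'
}
```

For example, `all_to English` runs the same command as `./auto.sh '?' English`.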
./auto.sh ? English will run for any language to English.
./auto.sh German ? will run for German to any language.
The auto.sh script can also be run with flags:
Most often, you will want to run ./auto.sh German English kty to recreate the dictionaries, then load them in Yomitan and test them.
After a run, data/language/de/en should contain files with skipped tags for IPA and terms. Adding some to tag_bank_ipa.json or tag_bank_term.json is an easy way to improve the conversion for your language pair.
Test inputs are in data/test/kaikki. Each line is copied from the corresponding kaikki file (in data/kaikki, after downloading).
To fix something in the conversion of a word, add its line from data/kaikki to the corresponding test file in data/test/kaikki.
Then run npm run test-write to add it to the expected test output, and commit the changes (e.g. add baseline test for "word").
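Copying the line by hand works, but the step can also be scripted. The helper below is a rough sketch, not part of the project: the name `add_test_line` is hypothetical, and it assumes that each kaikki line is a JSON object containing the text `"word": "<word>"` with exactly that spacing. If your files are formatted differently, adjust the pattern.

```shell
# Hypothetical helper: copy the first kaikki line for a word into a test file.
# Usage: add_test_line WORD KAIKKI_FILE TEST_FILE
add_test_line() {
    word=$1
    src=$2
    dst=$3

    # -F treats the pattern as a fixed string; -m 1 stops at the first match.
    # (Assumption: the entry contains "word": "<WORD>" verbatim.)
    line=$(grep -m 1 -F "\"word\": \"$word\"" "$src") || return 1

    # Append the matching line to the test input file.
    printf '%s\n' "$line" >> "$dst"
}
```

Pass the kaikki file from data/kaikki as the second argument and the matching test file from data/test/kaikki as the third, then run `npm run test-write` and commit. A word with several entries (for example a noun and a verb) has several lines in the kaikki file; this sketch copies only the first, so check that it picked the entry you meant.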
Now when you modify tidy-up or make-yomitan, you can run npm run test-write to see the changes you made.
If you are making a change that shouldn't change the output, just run npm run test to check if anything broke.