JMdict, JMnedict, Kanjidic, and Kradfile/Radkfile in JSON format
with more comprehensible structure and beginner-friendly documentation
Original XML files are less than ideal in terms of format. (My opinion only, the JMdict/JMnedict project in general is absolutely awesome!) This project provides the following changes and improvements:
Avoiding null (with few exceptions) and missing fields, preferring empty arrays. See http://thecodelesscode.com/case/6 for the inspiration for this.
See the Format documentation or TypeScript types.
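To illustrate why empty arrays are more convenient than null or missing fields, here is a minimal Python sketch. The entry and its field names ("kanji", "kana", "text") are assumptions made up for this example; check the Format documentation for the actual structure.

```python
import json

# Hypothetical entry: field names and values are illustrative assumptions,
# not copied from the Format documentation.
RAW = '{"id": "0", "kanji": [], "kana": [{"text": "れい", "tags": []}]}'


def all_spellings(entry):
    """Collect kanji and kana spellings without any None checks."""
    # Both lists are assumed to be always present; an entry without
    # kanji spellings carries [] instead of null or a missing key.
    return [item["text"] for item in entry["kanji"] + entry["kana"]]


entry = json.loads(RAW)
print(all_spellings(entry))
```

Because the lists are never null, the same loop handles entries with and without kanji spellings.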
Please also read the original documentation if you have more questions:
There are also Kotlin types, although they contain some methods and annotations you might not need.
There are three main types of JSON files for the JMdict dictionary:
An entry is considered common if any of its /k_ele/ke_pri or /r_ele/re_pri elements in the XML files contains one of these markers: "news1", "ichi1", "spec1", "spec2", "gai1".
Only one such element is enough for the whole word to be considered common.
This corresponds to how online dictionaries such as https://jisho.org
classify words as "common". Common-only distributions are much smaller.
They are marked with the "common" keyword in file names; see the latest release.
Also, JMdict and Kanjidic have language-specific versions with language codes (3-letter ISO 639-2 codes for JMdict, 2-letter ISO 639-1 codes for Kanjidic) in file names:
all - all languages, i.e. no language filter was applied
eng/en - English
ger/de - German
rus/ru - Russian
hun/hu - Hungarian
dut/nl - Dutch
spa/es - Spanish
fre/fr - French
swe/sv - Swedish
slv/sl - Slovenian
JMnedict and JMdict with examples have only one respective version each, since they are both English-only, and JMnedict has no "common" indicators on entries.
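The "common" rule described earlier (one priority element with one of the listed markers is enough) can be sketched in Python as follows. This is only an illustration, not the converter's actual code: the XML sample is made up and heavily trimmed, and the function name is_common is an assumption.

```python
import xml.etree.ElementTree as ET

# Markers that make an entry "common", as listed in this README.
COMMON_MARKERS = {"news1", "ichi1", "spec1", "spec2", "gai1"}

# Made-up, heavily trimmed sample; real JMdict entries carry more elements.
SAMPLE_XML = """
<JMdict>
  <entry>
    <ent_seq>1</ent_seq>
    <k_ele><keb>A</keb><ke_pri>nf05</ke_pri></k_ele>
    <r_ele><reb>a</reb><re_pri>ichi1</re_pri></r_ele>
  </entry>
  <entry>
    <ent_seq>2</ent_seq>
    <r_ele><reb>b</reb><re_pri>news2</re_pri></r_ele>
  </entry>
</JMdict>
""".strip()


def is_common(entry):
    """Return True if any ke_pri or re_pri element holds a common marker."""
    priorities = entry.findall("k_ele/ke_pri") + entry.findall("r_ele/re_pri")
    return any((p.text or "").strip() in COMMON_MARKERS for p in priorities)


root = ET.fromstring(SAMPLE_XML)
common_ids = [e.findtext("ent_seq") for e in root.iter("entry") if is_common(e)]
print(common_ids)
```

In the sample, the first entry qualifies through its reading element alone, while the second has only a marker outside the list.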
You don't need to install Gradle, just use the Gradle wrapper provided in this repository: gradlew (for Linux/macOS) or gradlew.bat (for Windows).
NOTE: You can grab the pre-built JSON files in the latest release.
Use the included scripts: gradlew (for Linux/macOS) or gradlew.bat (for Windows).
Tasks to convert dictionary files and create distribution archives:
./gradlew clean - clean all build artifacts to start a fresh build, in cases when you need to re-download and convert from scratch
./gradlew download - download and extract original dictionary XML files into build/dict-xml
./gradlew convert - convert all dictionaries to JSON and place into build/dict-json
./gradlew archive - create distribution archives (zip, tar+gzip) in build/distributions
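The tasks above are meant to be run in the order download, convert, archive. A small Python wrapper for that sequence might look like the sketch below. The helper names (build_commands, run_pipeline) and the repo_dir parameter are assumptions for illustration; only the task names and wrapper script names come from this README.

```python
import subprocess
import sys
from pathlib import Path

# Task order taken from this README: download -> convert -> archive.
PIPELINE = ["download", "convert", "archive"]


def build_commands(repo_dir=".", tasks=PIPELINE, platform=sys.platform):
    """Return one command line per task, using the bundled Gradle wrapper."""
    script = "gradlew.bat" if platform.startswith("win") else "gradlew"
    wrapper = str(Path(repo_dir).resolve() / script)
    return [[wrapper, task] for task in tasks]


def run_pipeline(repo_dir="."):
    """Run the tasks in order; stop at the first failing task."""
    for cmd in build_commands(repo_dir):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=repo_dir, check=True)


# Only print the commands here; call run_pipeline() inside a checkout.
for cmd in build_commands():
    print(" ".join(cmd))
```

Stopping at the first failure avoids converting or archiving stale files when the download step did not succeed.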
Utility tasks (for CI/CD workflows):
./gradlew --quiet jmdictHasChanged, ./gradlew --quiet jmnedictHasChanged, and ./gradlew --quiet kanjidicHasChanged - check if dictionary files have changed by comparing checksums of downloaded files with those stored in the checksums directory. Outputs YES or NO. Run this only after the download task! The --quiet flag silences Gradle logs, e.g. when you need to put values into environment variables.
./gradlew updateChecksums - update checksum files in the checksums directory. Run after creating distribution archives and commit checksum files into the repository, so that next time the CI/CD workflow knows if it needs to rebuild anything.
./gradlew uberJar - create an Uber JAR for standalone use (i.e. without Gradle). The JAR program shows help messages and should be intuitive to use if you know how to run it.
For the full list of available tasks, run ./gradlew tasks
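As a sketch of how a CI/CD script could consume the YES/NO output of the change-check tasks, the following Python snippet runs a task with --quiet and turns the answer into a boolean. The function names are assumptions, and the relative ./gradlew path assumes Linux/macOS and a run after the download task.

```python
import subprocess

# Task names as listed in this README.
CHECK_TASKS = ["jmdictHasChanged", "jmnedictHasChanged", "kanjidicHasChanged"]


def parse_answer(stdout):
    """Map the task output (YES or NO) to a boolean."""
    answer = stdout.strip()
    if answer not in ("YES", "NO"):
        raise ValueError(f"unexpected task output: {answer!r}")
    return answer == "YES"


def has_changed(task, wrapper="./gradlew", cwd="."):
    """Run one check task quietly and interpret its output."""
    result = subprocess.run(
        [wrapper, "--quiet", task],
        cwd=cwd, check=True, capture_output=True, text=True,
    )
    return parse_answer(result.stdout)


# Parsing only; has_changed() needs a checkout with downloaded files.
print(parse_answer("YES\n"))
```

Rejecting anything other than YES or NO makes the script fail loudly if Gradle logs leak into the output, for example when --quiet is forgotten.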
Run tasks in this order: download -> convert -> archive
Make sure java is available on your $PATH environment variable
Run Gradle with the --stacktrace, --info, or --debug arguments to see more details if you get an error
The original XML files - JMdict.xml, JMdict_e.xml, JMdict_e_examp.xml, and JMnedict.xml - are the property of the Electronic Dictionary Research and Development Group, and are used in conformance with the Group's license. Project started in 1991 by Jim Breen.
All derived files are distributed under the same license, as the original license requires it.
The original kanjidic2.xml file is released under Creative Commons Attribution-ShareAlike License v4.0. See the Copyright and Permissions section on the Kanjidic wiki for details.
All derived files are distributed under the same license, as the original license requires it.
The RADKFILE and KRADFILE files are copyright and available under the EDRDG Licence. The copyright of the RADKFILE2 and KRADFILE2 files is held by Jim Rose.
NPM packages @scriptin/jmdict-simplified-types and @scriptin/jmdict-simplified-loader are available under the MIT license.
The source code and other files of this project, excluding the files and packages mentioned above, are available under Creative Commons Attribution-ShareAlike License v4.0. See LICENSE.txt