migaku-official / Migaku-Kanji-Addon

Learn kanji within the context of the vocab in your Anki collection. Comes with a powerful lookup browser.
https://migaku.io
GNU General Public License v3.0
55 stars 12 forks

[FEATURE] Multilanguage Heisig data #213

Open vincentbohlen opened 3 weeks ago

vincentbohlen commented 3 weeks ago

I am not a native English speaker. Most of the kanji learning material and advice available on the Internet is in English or references English material. While I have no problem understanding English, when applying the RTK method I run into trouble with Heisig's sometimes outlandish choice of keywords and with his use of synonyms, which makes it important to memorize nuances. This is already difficult in one's native language, and even more difficult when the nuances have not been internalized in a second language. Since translated and adapted versions of RTK are available in several languages, I prefer to use the keywords from the edition published in my native language.

I played around with how to use the Kanji GOD add-on in German. Entering the translations into the custom keyword / custom primitive column would be a possible solution, but it would be a lot of manual work. I had an almost complete list of keywords and primitives on my PC, so I wrote some basic SQL to alter my local Kanji.db and overwrote the English keywords/primitives with the German ones. This works great for me. I am now thinking about also adding Heisig's stories and comments, but I personally don't necessarily need them anymore. I assume that there are other Japanese learners who would benefit from a version of Kanji GOD that aligns with the localized version of RTK they might be using. This is not about customizing the data but about providing the "official" localized set of keywords as a different base set.
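As a rough illustration of the overwrite described above, here is a minimal Python sketch. The table name `characters` and its columns `character` and `keyword` are assumptions made for the example, not the actual Kanji.db schema, and the German keywords are placeholder values. It runs against an in-memory database rather than a real Kanji.db.

```python
import sqlite3


def overwrite_keywords(conn, translations):
    """Replace existing keywords with localized ones.

    `translations` maps a kanji to its localized keyword.
    Returns the number of rows that were actually updated.
    """
    before = conn.total_changes
    conn.executemany(
        # Hypothetical schema: characters(character, keyword)
        "UPDATE characters SET keyword = ? WHERE character = ?",
        [(keyword, kanji) for kanji, keyword in translations.items()],
    )
    conn.commit()
    return conn.total_changes - before


# Stand-in for a local copy of the database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE characters (character TEXT PRIMARY KEY, keyword TEXT)")
conn.executemany(
    "INSERT INTO characters VALUES (?, ?)",
    [("一", "one"), ("日", "day"), ("月", "month")],
)

# Placeholder German keywords; kanji missing from the table are simply skipped.
changed = overwrite_keywords(conn, {"一": "eins", "日": "Tag", "鬱": "Schwermut"})
print(changed)
```

Kanji that have no translation keep their English keyword, so a partial list can be applied safely. Against a real file, it would be wise to work on a copy of the database.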

Providing the data for different languages would be a one-time effort. Users could replace the DB manually, but a language selection that loads the matching data may be the nicer solution. Migaku already allows selecting dictionaries for different languages, which removes the mental work of translating from English. Adjusting the Kanji GOD data in the same way would make for a seamless experience.

mjuhanne commented 3 weeks ago

@vincentbohlen Actually, the groundwork for this is already done. There is a fork of Kanji GOD (https://github.com/mjuhanne/Migaku-Kanji-Addon/tree/test_storydb) which contains a bunch of improvements that I haven't yet tried to merge into the main branch.

One of the improvements is Story DB: it moves the stories (Heisig, Koohi) from Kanji DB into a separate Story DB. In this database, each row contains a set of data for one kanji (source name, keyword, story, primitives). The source here refers to Heisig / Koohi / RRTK / Wanikani / "crowd-sourced". The RRTK and Wanikani data is gathered from a couple of Anki decks, and the crowd-sourced data is a mixture of the best Koohi stories and keywords (manually checked so they don't conflict with the Heisig ones), in addition to some of my own mnemonics and keywords.

What you (and maybe other users for other languages) would want to do is add another "source" to Story DB (for example heisig_de for German Heisig keywords). The process would be: 1) create a tab-separated file in which each row consists of source name + kanji + keywords; 2) merge those changes into Story DB with a separate Python script.
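To make the two steps more concrete, here is a minimal sketch of what such a merge could look like. It is not the script from the fork: the `stories` table, its columns, and the function names are hypothetical, and the sample keywords are placeholders. The upsert syntax needs SQLite 3.24 or newer.

```python
import csv
import io
import sqlite3


def read_keyword_rows(tsv_file):
    """Yield (source, kanji, keywords) tuples from a tab-separated file."""
    for row in csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        source, kanji, keywords = (field.strip() for field in row)
        yield source, kanji, keywords


def merge_into_story_db(conn, rows):
    """Insert new (source, kanji) rows, or update keywords of existing ones."""
    conn.executemany(
        # Hypothetical schema: stories(source, kanji, keywords, story, primitives)
        """
        INSERT INTO stories (source, kanji, keywords) VALUES (?, ?, ?)
        ON CONFLICT(source, kanji) DO UPDATE SET keywords = excluded.keywords
        """,
        rows,
    )
    conn.commit()


# Stand-in for Story DB (assumed schema), with one existing English row.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE stories (
           source TEXT, kanji TEXT, keywords TEXT, story TEXT, primitives TEXT,
           PRIMARY KEY (source, kanji))"""
)
conn.execute("INSERT INTO stories (source, kanji, keywords) VALUES ('heisig', '日', 'day')")

# Step 1: the tab-separated input (an in-memory stand-in for a real file).
tsv = io.StringIO("heisig_de\t日\tTag\nheisig_de\t月\tMonat\n")

# Step 2: merge it into the database as a new source.
merge_into_story_db(conn, read_keyword_rows(tsv))
```

Because the new keywords live under their own source name, the existing Heisig rows stay untouched, and re-running the merge with a corrected file only updates the keywords of that source.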

If you'd like to try this approach, let me know and I can walk you through it.

The test branch contains a bunch of other improvements, so you might want to take a look at it anyway.

I've used it like this for the past 8 months or so now, so it should be fairly stable. If you want to try it, make sure you use the right branch and DON'T FORGET TO BACK UP your previous Kanji GOD directory and Anki decks :)

Here are some screenshots of the current status:

Heisig, RRTK and Wanikani sources:

[Screenshot 2024-08-21 at 21 47 21] [Screenshot 2024-08-21 at 21 47 31]

Koohi and Crowd-sourced:

[Screenshot 2024-08-21 at 21 43 37] [Screenshot 2024-08-21 at 21 43 59]

Edit mode (here editing the crowd-sourced primitives list):

[Screenshot 2024-08-21 at 22 23 09]