migaku-official / Migaku-Kanji-Addon

Learn kanji within the context of the vocab in your Anki collection. Comes with a powerful lookup browser.
https://migaku.io
GNU General Public License v3.0
51 stars 12 forks

[FEATURE] Ability to group linked vocab by reading #159

Open saxoncameron opened 2 years ago

saxoncameron commented 2 years ago

Discord user ガリン#4920 requests: https://discord.com/channels/752293144917180496/846925957302714388/936149985043550248

is grouping by reading on the roadmap ? I want this feature so hard lol would be a game changer

Note there is a subsequent Discord discussion thread in the above link.

Further context, taken from thread:

行 is in so many words and the ぎょう reading is kinda rare so I have to look through all of those words to see which other word has the reading with ぎょう

Which I think is a valid and sound request.

Proposed implementation

I think we should add a radio button toggle adjacent to the "Vocab" heading that switches between two states: 1) sorting vocab words by frequency (default), and 2) grouping vocab words by reading. By "vocab words" I mean the "solid colour" buttons under the "Vocab" subheading. I don't think we should bother with the "upcoming" (outlined) or "example" (grey) word buttons.

Perhaps it would be nice to have an option in Kanji Settings so users can change the default sorting behaviour (for those who prefer group-by-reading as the default).
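To make the two proposed states concrete, here is a minimal Python sketch. It is not taken from the add-on's code: `VocabWord`, its fields, `sort_by_frequency` and `group_by_reading` are hypothetical names, and it assumes the reading the target kanji takes in each word is already known.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class VocabWord:
    """Hypothetical stand-in for one "solid colour" vocab button."""
    word: str            # e.g. "銀行"
    kanji_reading: str   # reading of the target kanji in this word, "" if unknown
    frequency_rank: int  # assumption: lower rank means more frequent


def sort_by_frequency(words):
    """State 1 (default): most frequent words first."""
    return sorted(words, key=lambda w: w.frequency_rank)


def group_by_reading(words):
    """State 2: bucket words by the target kanji's reading.

    Words keep their frequency order inside each group. Larger groups
    come first; ties are broken by the reading string so the order is
    deterministic. Words with an unknown reading land in a "?" group.
    """
    groups = defaultdict(list)
    for w in sort_by_frequency(words):
        groups[w.kanji_reading or "?"].append(w)
    return dict(sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0])))
```

With this shape, the proposed Kanji Settings option would only need to choose which of the two functions runs by default.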

Technical challenges, considerations

RicBent commented 2 years ago

I'm not sure that having the vocab words sorted by reading is that useful in many cases. For the case where you want to see a word with a specific reading, I propose this:

We could make the readings on the left side clickable. Whenever you click either a single reading or a reading group (like onyomi), it would highlight all matching words. This also gives you a quick overview of how frequent specific readings (or reading groups) are.

As for matching, I'd do this: first, use the furigana distribution to find the part of the reading that belongs to the target kanji. 連濁 (rendaku) should be fairly easy to handle. If the automatic distribution doesn't work because it is ambiguous, just search for the target reading in the reading of the entire word. More heuristics, like checking whether it is at the start, end, or middle of the word, should get the accuracy high enough.
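A rough Python sketch of the matching steps described above, under several assumptions: each word arrives as a list of `(text, reading)` segments from some furigana distribution step (kana segments carry themselves as their reading), and an ambiguous distribution shows up as a multi-character segment. The names `RENDAKU`, `rendaku_variants` and `kanji_has_reading` are hypothetical, and the voicing table is a simplification.

```python
# Unvoiced first kana -> possible voiced forms (simplified). The p-row
# forms are strictly a separate sound change but are accepted here too.
RENDAKU = {
    "か": "が", "き": "ぎ", "く": "ぐ", "け": "げ", "こ": "ご",
    "さ": "ざ", "し": "じ", "す": "ず", "せ": "ぜ", "そ": "ぞ",
    "た": "だ", "ち": "ぢ", "つ": "づ", "て": "で", "と": "ど",
    "は": "ばぱ", "ひ": "びぴ", "ふ": "ぶぷ", "へ": "べぺ", "ほ": "ぼぽ",
}


def rendaku_variants(reading):
    """Return the reading plus versions with a voiced first kana."""
    if not reading:
        return {reading}
    head, tail = reading[0], reading[1:]
    return {reading} | {v + tail for v in RENDAKU.get(head, "")}


def kanji_has_reading(kanji, target_reading, segments):
    """Guess whether `kanji` is read as `target_reading` in the word."""
    word = "".join(text for text, _ in segments)
    word_reading = "".join(reading for _, reading in segments)
    if kanji not in word:
        return False
    variants = rendaku_variants(target_reading)

    # Step 1: the distribution isolated the kanji in its own segment.
    for i, (text, reading) in enumerate(segments):
        if text == kanji:
            # Voicing is only expected on non-initial elements.
            return reading in (variants if i > 0 else {target_reading})

    # Step 2: ambiguous distribution. Search the whole word's reading,
    # using the kanji's position (start / end / middle) as a heuristic.
    pos = word.index(kanji)
    if pos == 0:
        return word_reading.startswith(target_reading)
    if pos == len(word) - 1:
        return any(word_reading.endswith(v) for v in variants)
    return any(v in word_reading[1:-1] for v in variants)
```

For example, `kanji_has_reading("行", "ぎょう", [("行", "ぎょう"), ("列", "れつ")])` is resolved in step 1, while an undistributed `[("旅行", "りょこう")]` falls through to the positional search in step 2. Words that fail both steps could be left unhighlighted rather than placed in a wrong group.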

Especially when grouping, handling words whose reading cannot be identified correctly could get a bit awkward. This solution should also fix that.