himselfv / wakan

Japanese and Chinese learning tool with dictionary

Duplicate kanji index entries in vocabularies #219

Open himselfv opened 10 years ago

himselfv commented 10 years ago

Original report by me.

The user vocabulary package has a kanji index table, which speeds up locating words that contain a given kanji. Every word is listed there once for every kanji character in it (one entry per occurrence). This index is used by KanjiCompounds when listing vocabulary words.

Words which have two instances of the same kanji (e.g. ichi-ichi) are listed twice, and displayed twice in KanjiCompounds.
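To make the failure concrete, here is a minimal Python sketch of how a per-occurrence index produces the duplicate. It is not Wakan's actual code: the function name `build_index_naive`, the in-memory dict layout, and the sample words are all assumptions for illustration, and every character is treated as a kanji for simplicity.

```python
from collections import defaultdict

def build_index_naive(words):
    """Hypothetical indexer: one entry per character occurrence, repeats included."""
    index = defaultdict(list)
    for word_id, written in enumerate(words):
        for ch in written:
            index[ch].append(word_id)
    return index

# Illustrative vocabulary; the second word repeats the same character.
words = ["日本", "日日"]
index = build_index_naive(words)

# A lookup that trusts the index returns the repeated word twice.
compounds = [words[i] for i in index["日"]]
```

With this layout, `compounds` contains the second word twice, which mirrors the double listing seen in KanjiCompounds.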

Either the indexing routine should be fixed to list such words only once per kanji, with a patch written to update existing vocabularies (preferable), or KanjiCompounds should de-duplicate results before displaying them. Or maybe both.
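Both options are small changes. The sketch below shows them in Python under stated assumptions: it is not Wakan's code, the names `build_index_unique` and `dedupe_preserving_order` are hypothetical, the index is modelled as an in-memory dict rather than the real vocabulary table, and every character is treated as a kanji.

```python
from collections import defaultdict

def build_index_unique(words):
    """Option 1 (hypothetical): index each word once per distinct character."""
    index = defaultdict(list)
    for word_id, written in enumerate(words):
        # dict.fromkeys keeps the first occurrence of each character, in order.
        for ch in dict.fromkeys(written):
            index[ch].append(word_id)
    return index

def dedupe_preserving_order(word_ids):
    """Option 2 (hypothetical): drop repeated word ids at display time."""
    return list(dict.fromkeys(word_ids))

# Illustrative data: the second word repeats the same character.
fixed_index = build_index_unique(["日本", "日日"])
shown = dedupe_preserving_order([0, 1, 1])
```

Option 1 keeps the stored index clean but needs a migration for existing vocabularies; option 2 tolerates old data without touching the files. Doing both, as suggested above, covers vocabularies that were never patched.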

Or maybe the custom vocabulary format just needs to be scrapped and the standard dictionary format used instead! (Plus a converter written for existing vocabs, of course.) At least it's one problem instead of two.