sschmidTU / mr-kanji-search-wtk

WTK-Search is a Kanji search engine using (multiple) Wanikani radicals or RTK names, on a RTK element dataset of 3000+ Kanji
https://sschmidtu.github.io/mr-kanji-search-wtk/

Rewrite elements data so that subelements/synonyms aren't copied for every kanji #24

Closed sschmidTU closed 1 month ago

sschmidTU commented 3 years ago

The kanji 土 ('soil') consists of a single element, which can be called soil, dirt or ground (synonyms). Right now, every kanji that contains this element, e.g. 吐 ('spit'), has all of these synonyms copied into its data: elements: spit, mouth, soil, dirt, ground

The data would be much cleaner and easier to work with if synonyms were centralized and each element were listed only once per kanji, under one standard name. For 吐, that would be: elementsPure: mouth, soil

This would also make it much easier to output which elements a kanji has.
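To illustrate the idea, here is a minimal sketch of centralized synonyms. The object layout and the names elementSynonyms, buildStandardNameLookup and searchByElement are assumptions made up for this example, not the project's actual code; only the element names are taken from this issue.

```javascript
// Hypothetical layout: each element is stored once, under its standard name,
// together with its synonyms (element names taken from the 土 example above).
const elementSynonyms = {
  soil: ["dirt", "ground"],
  mouth: [],
};

// Kanji entries list standard names only (the proposed elementsPure field).
const kanjiData = [
  { kanji: "吐", keyword: "spit", elementsPure: ["mouth", "soil"] },
];

// Build a reverse lookup from any name (standard or synonym) to the standard name.
function buildStandardNameLookup(synonymsByElement) {
  const lookup = new Map();
  for (const [standardName, synonyms] of Object.entries(synonymsByElement)) {
    lookup.set(standardName, standardName);
    for (const synonym of synonyms) {
      lookup.set(synonym, standardName);
    }
  }
  return lookup;
}

// Resolve the search term to its standard name first, then match kanji entries.
function searchByElement(term, lookup, entries) {
  const standardName = lookup.get(term) ?? term;
  return entries.filter((entry) => entry.elementsPure.includes(standardName));
}

const lookup = buildStandardNameLookup(elementSynonyms);
console.log(searchByElement("dirt", lookup, kanjiData).map((e) => e.kanji)); // [ '吐' ]
```

With this shape, adding a synonym means touching one entry in the element table instead of every kanji that contains the element.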

Unfortunately, this will require 'refactoring' the element data for the existing 3060 or so kanji, turning elements into elementsPure. But maybe we can add a little bit of clever automation to that.
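One possible bit of automation, as a rough sketch: map every name in the old elements list to its standard name, then drop everything that is already a subelement of another listed element. The dictionary shape and the function name toElementsPure are hypothetical, and the assignment of subelements to climax below is guessed purely for illustration; it is not taken from elements_data.txt.

```javascript
// Hypothetical element dictionary. The synonym/subelement assignments are
// assumptions for this example only.
const elementsDict = {
  thread: { synonyms: ["spiderman"], subelements: [] },
  climax: { synonyms: [], subelements: ["wall", "one", "ceiling", "elbow", "soil"] },
  soil: { synonyms: ["dirt", "ground"], subelements: [] },
  taskmaster: { synonyms: [], subelements: [] },
};

function toElementsPure(elements, dict) {
  // Reverse lookup: any known name (standard or synonym) -> standard name.
  const standardNameOf = new Map();
  for (const [name, info] of Object.entries(dict)) {
    standardNameOf.set(name, name);
    for (const synonym of info.synonyms) standardNameOf.set(synonym, name);
  }

  // Replace synonyms by standard names; the Set drops duplicates and keeps order.
  const standardNames = [
    ...new Set(elements.map((name) => standardNameOf.get(name) ?? name)),
  ];

  // Collect every name that is (recursively) a subelement of a listed element.
  const implied = new Set();
  const collect = (name) => {
    for (const sub of dict[name]?.subelements ?? []) {
      const standardSub = standardNameOf.get(sub) ?? sub;
      if (!implied.has(standardSub)) {
        implied.add(standardSub);
        collect(standardSub);
      }
    }
  };
  standardNames.forEach(collect);

  // Keep only the names that are not implied by another listed element.
  return standardNames.filter((name) => !implied.has(name));
}

const oldElements = [
  "thread", "spiderman", "climax", "wall", "one", "ceiling",
  "elbow", "soil", "dirt", "ground", "taskmaster",
];
console.log(toElementsPure(oldElements, elementsDict)); // [ 'thread', 'climax', 'taskmaster' ]
```

A caveat of this approach: if a kanji contains an element both on its own and inside a bigger element, the sketch drops the standalone occurrence too, so the output would still need a manual check.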

sschmidTU commented 3 years ago

Commit 1e4fca0596da442bbf6b70518216c83eb5b5a159 above is a start on this issue: it adds a new field, elementsPure, to some kanji, which names only the main (biggest) elements, not all the subelements. Example, 緻:

elements: thread, spiderman, climax, wall, one, ceiling, elbow, soil, dirt, ground, taskmaster // old way
elementsPure: thread, climax, taskmaster // new way

Sometimes the elementsPure aren't so clear-cut, as in 羞, so I added a flag, elementsPureVague:

elements: sheep, horns, king, stick, celery, five, salad, flowers
elementsPure: sheep, celery // celery would have all lines going over the edge to the right
elementsPureVague: yes

sschmidTU commented 3 years ago

New kanji can now be added just by giving their main elements in elementsPure. There is no need to list all subelements in elements anymore, at least if the elements are already in elements_data.txt/elementsDict.js (not yet complete). Running node _tools/elementsDataToJson.js creates assets/js/elementsDict.js from elements_data.txt.

So, for 當 (kanji 3071), I just added this:

elementsPure: schoolhouse, mouth, rice field

instead of this:

elements: schoolhouse, owl, one, animal legs, crown, mouth, rice field, brains
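As a sketch of why the short form can be enough: given an element dictionary, the full list of searchable names can be derived from elementsPure. The dictionary shape and the function name expandElements are hypothetical (the real elementsDict.js may look different), and the grouping of subelements under schoolhouse is an assumption made for this example.

```javascript
// Hypothetical element dictionary; the assignments below are illustrative guesses,
// not the contents of elements_data.txt.
const elementsDict = {
  schoolhouse: { synonyms: [], subelements: ["owl", "one", "animal legs", "crown"] },
  mouth: { synonyms: [], subelements: [] },
  "rice field": { synonyms: ["brains"], subelements: [] },
};

// Expand main elements into all names a search should match:
// the element itself, its synonyms, and (recursively) its subelements.
function expandElements(elementsPure, dict) {
  const expanded = new Set(); // a Set keeps insertion order and ignores duplicates
  const visit = (name) => {
    if (expanded.has(name)) return; // already handled, also guards against cycles
    expanded.add(name);
    const info = dict[name];
    if (!info) return; // element not in the dictionary yet: keep just its name
    info.synonyms.forEach((synonym) => expanded.add(synonym));
    info.subelements.forEach(visit);
  };
  elementsPure.forEach(visit);
  return [...expanded];
}

console.log(expandElements(["schoolhouse", "mouth", "rice field"], elementsDict).join(", "));
// schoolhouse, owl, one, animal legs, crown, mouth, rice field, brains
```

Elements missing from the dictionary are passed through unchanged, so such a lookup would keep working while the dictionary is still incomplete.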

sschmidTU commented 2 years ago

Will be fixed by #32

sschmidTU commented 1 month ago

Done (#32, #25, etc.)