Open sschmidTU opened 3 years ago
To still match RTK keywords while predicting WK radicals, the predicted radical can be run as an additional query instead of replacing the original one.
For example, the regex `foreh.*` can be predicted as WK's forehead radical (crown in RTK), but we should still also find 額 (forehead in RTK, amount in WK).
Often the prediction is unambiguous, like `tsun.*` being tsunami, but maybe we should still add it as an additional query to be safe, since it will be hard to check which part of each radical name is unique across the whole dataset.
Of course, this wouldn't be necessary if we had the whole dataset annotated with WK radicals directly.
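A minimal sketch of the "additional query" idea, assuming a lookup table from WK radical names to RTK keywords. The names `WK_RADICAL_TO_RTK`, `ENTRIES`, `predict_wk_radical`, `expand_query` and `search` are hypothetical and not taken from the project, and the two-entry dataset is only a toy stand-in:

```python
import re

# Hypothetical table: WK radical name -> RTK keyword for the same element.
WK_RADICAL_TO_RTK = {"forehead": "crown"}

# Toy stand-in for the real dataset: (character, RTK keyword).
ENTRIES = [("額", "forehead"), ("冖", "crown")]


def predict_wk_radical(term):
    """Return the first WK radical name fully matched by the regex term, or None."""
    for name in WK_RADICAL_TO_RTK:
        if re.fullmatch(term, name):
            return name
    return None


def expand_query(term):
    """Keep the original term and add the predicted radical as an extra query."""
    queries = [term]
    radical = predict_wk_radical(term)
    if radical is not None:
        queries.append(WK_RADICAL_TO_RTK[radical])
    return queries


def search(term):
    """Run every expanded query against the keywords and merge the hits in order."""
    results = []
    for query in expand_query(term):
        for char, keyword in ENTRIES:
            if re.fullmatch(query, keyword) and char not in results:
                results.append(char)
    return results
```

With this, `search("foreh.*")` returns both the RTK forehead kanji and the element that WK calls forehead, so neither result hides the other.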
In RTK mode, partial input is no problem ("recl" will usually lead to the same result as "reclining"). WK-specific radicals, however, are currently only replaced when typed in full, and only then do they (usually) give the desired result: "triceratop" will not find anything, but "triceratops" will.
-> Build a system where each radical has a minimal matching regex, e.g. `tric[a-z]*` for triceratops.
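One way such regexes could be generated is from the shortest prefix of each name that no other name shares. The sketch below assumes a flat list of non-empty lowercase names. `minimal_prefix_regexes` and `resolve` are hypothetical helpers, and the vocabulary (including "tree" and "trip") is illustrative only. Uniqueness is checked only within the given list, so the real list would have to cover every keyword the search can match.

```python
import re


def minimal_prefix_regexes(names):
    """Map each name to a regex built from its shortest prefix unique among names."""
    regexes = {}
    for name in names:
        others = [n for n in names if n != name]
        prefix = name
        for length in range(1, len(name) + 1):
            prefix = name[:length]
            # Stop at the first prefix that no other name starts with.
            if not any(o.startswith(prefix) for o in others):
                break
        regexes[name] = re.compile(re.escape(prefix) + "[a-z]*")
    return regexes


def resolve(partial, regexes):
    """Return the single name that the partial input identifies, or None."""
    hits = [
        name
        for name, rx in regexes.items()
        # The input must match the minimal regex and be a prefix of the full name,
        # so that "tricky" does not resolve to "triceratops".
        if rx.fullmatch(partial) and name.startswith(partial)
    ]
    return hits[0] if len(hits) == 1 else None


# Illustrative vocabulary, not the project's actual radical list.
NAMES = ["triceratops", "tsunami", "forehead", "tree", "trip"]
REGEXES = minimal_prefix_regexes(NAMES)
```

Here `REGEXES["triceratops"].pattern` is `tric[a-z]*`, and `resolve("triceratop", REGEXES)` finds triceratops, while the too-short input "tri" resolves to nothing.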