FooSoft / yomichan

Japanese pop-up dictionary extension for Chrome and Firefox.

Kanji with kanji dictionary entries and no term dictionary entries do not show up on the Yomichan search page #2211

Open MarvNC opened 1 year ago

MarvNC commented 1 year ago

Description Some kanji might only have an entry in a kanji dictionary and not in any term dictionary, in which case they may not be shown on the Yomichan search page. For example, 幷 can be hovered normally using Yomichan, but it does not show up when searched for on the search page.

Below, with KANJIDIC installed and no term dictionaries containing the kanji.

Hover: (screenshot)

Search page: (screenshot)

Browser version Chrome Version 104.0.5112.102 (Official Build) (64-bit)

Yomichan version Yomichan

Exported settings file

MarvNC commented 1 year ago

In addition, some kanji do not scan at all even though they have a valid kanji dictionary entry. Examples include many of the kanji in this file, all of which are included in this dictionary. Maybe there is some logic limiting what counts as a kanji?

toasted-nutbread commented 1 year ago

By default, the search page does not search for kanji-only definitions. The URL will look something like chrome-extension://.../search.html?query=幷 or chrome-extension://.../search.html?query=幷&type=terms&wildcards=off, which has no results.

Results will come up if the URL is: chrome-extension://.../search.html?query=幷&type=kanji (note the type=kanji part at the end).
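To make the difference between the two URLs concrete, here is a minimal sketch that builds only the query-string part. The helper name `buildSearchQuery` and the `SearchType` type are hypothetical and not part of Yomichan; the parameter names `query` and `type` and the values `terms` and `kanji` are taken from the URLs quoted above. The extension's base URL is left out because it is elided in the thread.

```typescript
// Values of the "type" parameter seen in the URLs quoted in this thread.
type SearchType = "terms" | "kanji";

// Hypothetical helper (not a Yomichan function): builds the query string
// that would follow ".../search.html".
function buildSearchQuery(query: string, type: SearchType): string {
    // encodeURIComponent percent-encodes the kanji as UTF-8 bytes.
    return "?query=" + encodeURIComponent(query) + "&type=" + type;
}

// Term search, which reportedly returns nothing for a kanji-only entry.
const termQuery = buildSearchQuery("幷", "terms");

// Kanji search, which reportedly does return the KANJIDIC entry.
const kanjiQuery = buildSearchQuery("幷", "kanji");

console.log(termQuery);
console.log(kanjiQuery);
```

The browser shows the kanji unescaped in the address bar, but the percent-encoded form produced here should decode to the same query.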

This is obviously sub-optimal in terms of actual user workflow, but this is the reason why this happens.

For your second comment, are you able to point out a specific kanji that has the issue? I am assuming that when you say "scan", you mean regular scanning outside of the search page. I am able to scan many of the kanji in that file without issue.

MarvNC commented 1 year ago

Ah, I didn't know about the search page query.

Yes, here are some examples: 𠈓 𠡍 𡉴. These do not scan even with an entry in an installed dictionary. You can see they are rendered with a different font, which makes me suspect they are outside some range. They're listed in JIS level 4, so they're probably Japanese, and other kanji in that level do scan correctly.

Every single kanji in that file is included in the dictionary I linked above, which is based on it.

(screenshot: kanji_bank_1.json open in Visual Studio Code)

toasted-nutbread commented 1 year ago

I suspect this has something to do with UTF-16 surrogate pairs. That is, the kanji you posted are technically represented as two "characters" which are rendered as a single glyph, so this may be causing some issues.
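The surrogate-pair behaviour can be seen in a short sketch. This is not Yomichan's scanning code, and whether this is the actual cause is only the suspicion stated above; `isHighSurrogate`, `isLowSurrogate`, and `splitCodePoints` are hypothetical helpers written for illustration. It shows that one of the reported kanji occupies two UTF-16 code units, so code that steps through a string one unit at a time would see two halves rather than one character.

```typescript
// UTF-16 encodes characters above U+FFFF as a high surrogate
// (0xD800-0xDBFF) followed by a low surrogate (0xDC00-0xDFFF).
function isHighSurrogate(unit: number): boolean {
    return unit >= 0xD800 && unit <= 0xDBFF;
}

function isLowSurrogate(unit: number): boolean {
    return unit >= 0xDC00 && unit <= 0xDFFF;
}

// Hypothetical helper: splits a string into whole characters,
// keeping each surrogate pair together.
function splitCodePoints(text: string): string[] {
    const result: string[] = [];
    let i = 0;
    while (i < text.length) {
        const paired =
            isHighSurrogate(text.charCodeAt(i)) &&
            i + 1 < text.length &&
            isLowSurrogate(text.charCodeAt(i + 1));
        const width = paired ? 2 : 1;
        result.push(text.substring(i, i + width));
        i += width;
    }
    return result;
}

console.log("幷".length);  // 1: a single UTF-16 code unit
console.log("𠈓".length);  // 2: a surrogate pair
// Indexing by code unit yields only the first half of the pair.
console.log(isHighSurrogate("𠈓".charCodeAt(0)));  // true
console.log(splitCodePoints("幷𠈓").length);  // 2 whole characters
```

If a scanner or range check looks at `text[0]` or `charCodeAt(0)` alone, it would see a lone surrogate for these kanji, which would be consistent with them failing to match a kanji dictionary entry.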