obynio / anki-japanese-furigana

Anki add-on providing support for adding furigana on Japanese text
https://ankiweb.net/shared/info/678316993
GNU General Public License v3.0

Some "numbers" having readings removed aren't actually numbers #25

Open ahlec opened 1 year ago

ahlec commented 1 year ago

We have the config option to remove readings from numbers, which we're currently doing by removing readings associated with 一二三四...

However, not all occurrences of those characters are outright numbers. Example: 一通り (ひととおり) uses 一, but shouldn't have its reading removed because it's part of a phrase.

A potential first step would be to remove a reading only when the character is a numeral and the reading is one of its regular numeric readings (for 一: いち, いっ, etc.). That might not hold up long-term, though. To fully fix this, we might need a separate tool with a database/dictionary lookup that determines whether a word is a number or number + counter (we want to remove the reading) or a regular word (we want to keep the reading).
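A minimal sketch of that first step, to make the idea concrete. It assumes the add-on can hand us a (kanji, reading) pair per segment; `NUMERIC_READINGS` and `should_strip_reading` are hypothetical names, and the reading lists are illustrative rather than complete.

```python
# Hypothetical table of "regular" numeric readings per numeral.
# Illustrative only: not the add-on's actual data, and not exhaustive.
NUMERIC_READINGS = {
    "一": {"いち", "いっ"},
    "二": {"に"},
    "三": {"さん"},
    "四": {"よん", "し"},
}


def should_strip_reading(kanji: str, reading: str) -> bool:
    """Return True only if `reading` is a regular numeric reading of `kanji`."""
    return reading in NUMERIC_READINGS.get(kanji, set())


print(should_strip_reading("一", "いち"))  # True: plain number, strip the reading
print(should_strip_reading("一", "ひと"))  # False: as in 一通り, keep the reading
```

The check is purely per-character, so it can't tell a number from a regular word that happens to use a regular numeric reading.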

EDIT: Interestingly, 一先ず doesn't have the reading removed from the 一. So this problem clearly doesn't affect every such word, even currently.

Actual: 一通りの一先ず一通[とおり]の一先[ひとま]ず

Expected: 一通りの一先ず一通[ひととおり]の一先[ひとま]ず

ahlec commented 1 year ago

Another test case to include in this would be 一切 (いっさい), which would also challenge the earlier proposal of "only remove it if the reading is a regular one": here the 一 is read with the regular いっ, yet 一切 is a regular word, not a number.
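A rough sketch of the word-level lookup that a case like 一切 would need. Everything here is an assumption for illustration: `KNOWN_WORDS` is a tiny hard-coded stand-in for a real dictionary, and `should_keep_reading` is a hypothetical helper, not something the add-on has today.

```python
# Numeral characters whose readings the config option would normally strip.
NUMERALS = frozenset("〇一二三四五六七八九十百千万")

# Hypothetical stand-in for a real dictionary/database lookup.
KNOWN_WORDS = frozenset({"一通り", "一先ず", "一切"})


def should_keep_reading(word: str) -> bool:
    """Keep the reading for known regular words and for words without numerals."""
    if word in KNOWN_WORDS:
        return True
    # Anything else containing a numeral is treated as a number or number + counter.
    return not any(ch in NUMERALS for ch in word)


for word in ["一切", "一通り", "三人", "猫"]:
    print(word, should_keep_reading(word))
```

With this, 一切 and 一通り keep their readings, while 三人 (number + counter) is still treated as a number. The quality of the result depends entirely on the lookup behind `KNOWN_WORDS`.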