obynio / anki-japanese-furigana

Anki add-on providing support for adding furigana on Japanese text
https://ankiweb.net/shared/info/678316993
GNU General Public License v3.0

Error when generating readings including ヶ月 #27

Closed ahlec closed 1 year ago

ahlec commented 1 year ago

Example sentence: 彼はトルコを2ヶ月間訪問するつもりです

Example sentence: 彼はトルコを2ヵ月間訪問するつもりです

When you attempt to generate furigana for either of the sentences above, the add-on crashes with an error saying that the regular expression found no match. The Mecab output for the second sentence (the ヵ月 variant) is:

彼[カレ] は[ハ] ▦[] トルコ[トルコ] ▦[] を[ヲ] 2[] ヵ月[カゲツ] 間[カン] 訪問[ホウモン] する[スル] つもり[ツモリ] です[デス]

For the ヶ月 portion, the regular expression generated by kanjiToRegex is ^ゖ(.+?)$. It begins with a literal ゖ, so it rightly fails to match the reading that Mecab returns (かげつ).
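The mismatch can be reproduced outside the add-on. The snippet below is only a minimal sketch: the pattern string is the one quoted above, while `kata_to_hira` is a hypothetical helper written for this illustration (it is not the add-on's code) that converts Mecab's katakana reading to hiragana.

```python
import re


def kata_to_hira(text):
    # Hypothetical helper, not taken from the add-on: shift each katakana
    # character in U+30A1..U+30F6 down by 0x60 to get its hiragana form.
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


# Pattern quoted in this issue as the output of kanjiToRegex for ヶ月.
pattern = re.compile("^ゖ(.+?)$")

# Mecab reports the reading as カゲツ.
reading = kata_to_hira("カゲツ")

# The reading starts with か, not with the literal small ゖ, so no match.
match = pattern.match(reading)
print(reading, match)
```

Because `match` is `None` here, any code that assumes a successful match and reads its groups will raise, which is consistent with the crash described above.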

This bug happens when a small kana character is read as a full-size kana. As such, it applies to both ヶ月 and ヵ月, and it may apply to other words as well (none come to mind, but that doesn't mean they don't exist).

c0eos commented 10 months ago

I encountered an issue with this fix: it doesn't handle 戦場ヶ原 (a name read Senjougahara), where ヶ is read が rather than か.
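One possible way to cover both readings is to turn a small ヶ or ヵ into a character class instead of a literal. This is only a sketch under assumptions, not the add-on's actual fix: `surface_to_regex`, `is_kanji`, `to_hiragana` and the `SMALL_KANA_READINGS` table are hypothetical names made up here, and the table lists only the か and が readings mentioned in this thread.

```python
import re

# Assumption: only the two readings discussed in this issue are listed.
SMALL_KANA_READINGS = {"ヶ": "かが", "ヵ": "かが", "ゖ": "かが", "ゕ": "かが"}


def is_kanji(ch):
    # Rough check: CJK Unified Ideographs plus the iteration mark 々.
    return "\u4e00" <= ch <= "\u9fff" or ch == "々"


def to_hiragana(ch):
    # Shift katakana (U+30A1..U+30F6) to the matching hiragana.
    return chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch


def surface_to_regex(surface):
    # Hypothetical stand-in for kanjiToRegex: each kanji run becomes one
    # lazy capture group, a small ヶ/ヵ becomes a class of full-size
    # readings, and any other kana is matched literally as hiragana.
    parts = []
    in_kanji_run = False
    for ch in surface:
        if ch in SMALL_KANA_READINGS:
            parts.append("[" + SMALL_KANA_READINGS[ch] + "]")
            in_kanji_run = False
        elif is_kanji(ch):
            if not in_kanji_run:
                parts.append("(.+?)")
            in_kanji_run = True
        else:
            parts.append(re.escape(to_hiragana(ch)))
            in_kanji_run = False
    return "^" + "".join(parts) + "$"


print(surface_to_regex("ヶ月"))
print(re.match(surface_to_regex("ヶ月"), "かげつ").groups())
print(re.match(surface_to_regex("戦場ヶ原"), "せんじょうがはら").groups())
```

With this sketch, ヶ月 yields a pattern that accepts かげつ, and 戦場ヶ原 splits around the が. A character class is looser than a literal, so a real fix would want to check that it does not produce wrong splits for readings that contain か or が elsewhere.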