Closed rsimmons closed 3 years ago
Thank you for asking about Sudachi Normalization.
すみません could be 済みません, 住みません, or 澄みません. When a word could correspond to more than one candidate, Sudachi does not normalize it to any particular one.
すみ 動詞,一般,*,*,五段-マ行,連用形-一般 すむ
ませ 助動詞,*,*,*,助動詞-マス,未然形-一般 ます
ん 助動詞,*,*,*,助動詞-ヌ,終止形-撥音便 ず
Therefore, normalizing すみ (動詞,一般,*,*,五段-マ行,連用形-一般) to すむ is the correct behavior.
In the same way, すい (動詞,一般,*,*,五段-マ行,連用形-イ音便) should be normalized to すむ, not to 済む as it currently is:
すい 動詞,一般,*,*,五段-マ行,連用形-イ音便 済む
ませ 助動詞,*,*,*,助動詞-マス,未然形-一般 ます
ん 助動詞,*,*,*,助動詞-ヌ,終止形-撥音便 ず
We will fix it in the next update.
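The rule described above (leave a word as written when more than one kanji spelling is possible) can be illustrated with a small sketch. This is only a toy model and not Sudachi's actual implementation: the `CANDIDATES` table and the `normalize` function are hypothetical, and the たべる entry is an assumed example added for contrast.

```python
from typing import Dict, List

# Hypothetical toy lexicon: hiragana dictionary form -> possible kanji spellings.
# The three candidates for すむ are the ones named in this thread;
# the たべる entry is an assumed unambiguous example.
CANDIDATES: Dict[str, List[str]] = {
    "すむ": ["済む", "住む", "澄む"],
    "たべる": ["食べる"],
}


def normalize(dictionary_form: str) -> str:
    """Return the kanji spelling only when exactly one candidate exists."""
    options = CANDIDATES.get(dictionary_form, [])
    if len(options) == 1:
        return options[0]
    # Ambiguous or unknown: keep the form as written.
    return dictionary_form


print(normalize("すむ"))    # ambiguous, so it stays in hiragana
print(normalize("たべる"))  # single candidate, so it is normalized
```

Under this model, both すみ and すい would map to すむ, because the verb behind them is ambiguous regardless of the conjugated surface form.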
Is this the right place to report linguistic issues with the dictionaries? Apologies if not.
Using Sudachi 0.4.3, the core dictionary version 20200722, and mode C, I noticed that すみません and すいません do not normalize to the same verb, and it seems like they should.
For すいません, the normalized verb is 済む, which seems correct:
For すみません, the normalized verb is すむ. It seems like it should be 済む also?
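For anyone who wants to check the discrepancy mechanically, here is a minimal sketch that compares the normalized form of the verb in the two analyses. It assumes each output line has three whitespace-separated fields (surface, part of speech, normalized form), as in the output quoted in this thread; the helper names are hypothetical and the script does not call Sudachi itself.

```python
from typing import List, Tuple

# Analyses as quoted in this thread (Sudachi 0.4.3, core dictionary 20200722, mode C).
SUMIMASEN = """\
すみ 動詞,一般,*,*,五段-マ行,連用形-一般 すむ
ませ 助動詞,*,*,*,助動詞-マス,未然形-一般 ます
ん 助動詞,*,*,*,助動詞-ヌ,終止形-撥音便 ず
"""

SUIMASEN = """\
すい 動詞,一般,*,*,五段-マ行,連用形-イ音便 済む
ませ 助動詞,*,*,*,助動詞-マス,未然形-一般 ます
ん 助動詞,*,*,*,助動詞-ヌ,終止形-撥音便 ず
"""


def parse_line(line: str) -> Tuple[str, str, str]:
    """Split one output line into (surface, part of speech, normalized form)."""
    surface, pos, normalized = line.split()
    return surface, pos, normalized


def verb_normalized_forms(output: str) -> List[str]:
    """Collect the normalized forms of all morphemes tagged as verbs (動詞)."""
    forms = []
    for line in output.splitlines():
        _surface, pos, normalized = parse_line(line)
        if pos.startswith("動詞"):
            forms.append(normalized)
    return forms


a = verb_normalized_forms(SUMIMASEN)
b = verb_normalized_forms(SUIMASEN)
print("consistent" if a == b else f"mismatch: {a} vs {b}")
```

With the output quoted here, the two lists differ, which is the inconsistency being reported.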