tshatrov / ichiran

Linguistic tools for texts in Japanese language
MIT License

てもいい / でもいい dropping も out of data #40

Closed (molesquirrel closed this issue 6 months ago)

molesquirrel commented 9 months ago

Tested on ichi.moe, but the same thing happens on the cli version.

Input: やってもいい

The kana/romaji is correct, but the compound breakdown only shows やって and いい; the も is missing. Interestingly, the definition given for いい is "it's ok if ... / is it ok if ...?", which suggests that at some layer the system is looking at the てもいい structure.

Link: https://ichi.moe/cl/word/?q=%E3%82%84%E3%81%A3%E3%81%A6%E3%82%82%E3%81%84%E3%81%84
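For anyone wanting to reproduce this with other inputs, the query in the link above is just the UTF-8 percent-encoded input text. A minimal sketch, assuming the URL pattern stays as in the link; the helper name `ichi_moe_url` is made up for illustration and is not part of ichiran:

```python
from urllib.parse import quote

# Base URL copied from the link in this report; assumed to be stable.
BASE_URL = "https://ichi.moe/cl/word/?q="


def ichi_moe_url(text: str) -> str:
    """Build an ichi.moe lookup URL by percent-encoding the input as UTF-8."""
    return BASE_URL + quote(text, safe="")


print(ichi_moe_url("やってもいい"))
# https://ichi.moe/cl/word/?q=%E3%82%84%E3%81%A3%E3%81%A6%E3%82%82%E3%81%84%E3%81%84
```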

tshatrov commented 9 months ago

The compound parts are not guaranteed to sum up to the entire word (in some cases that's actually impossible). Both いい and もいい are mapped to the same suffix, いい, and they have the same meaning. I think this was caused by JMdict not having a separate entry for もいい, so I just used いい for both. I don't think it's a big issue because it still shows the correct meaning.
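To illustrate the effect described here, below is a toy sketch in Python. It is not ichiran's actual code (ichiran is written in Common Lisp); the `SuffixEntry` class, the `SUFFIXES` table and `split_compound` are all hypothetical. It only shows how mapping two surface forms to one shared entry can make the displayed parts shorter than the input while the meaning stays correct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuffixEntry:
    text: str   # form shown in the compound breakdown
    gloss: str  # meaning shown to the user


# Single shared entry; the gloss is the one quoted in the report.
II = SuffixEntry("いい", "it's ok if ... / is it ok if ...?")

# Hypothetical table: both surface forms point at the same entry.
SUFFIXES = {"もいい": II, "いい": II}


def split_compound(word: str) -> list[str]:
    """Split off a known suffix, displaying the entry's own text."""
    # Try longer surface forms first so もいい wins over いい.
    for surface in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(surface) and len(word) > len(surface):
            stem = word[: -len(surface)]
            return [stem, SUFFIXES[surface].text]
    return [word]


parts = split_compound("やってもいい")
print(parts)                            # ['やって', 'いい']
print("".join(parts) == "やってもいい")  # False: も is not displayed
```

In this toy model, the も is matched as part of the surface form but never displayed, because the breakdown shows the shared entry's text rather than the matched text.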

tshatrov commented 5 months ago

While experimenting with suffixes, I decided to add もいい as a separate entry, so this is now fixed.