WorksApplications / SudachiDict

A lexicon for Sudachi

快い appears to have incorrect normalized form #32

Open rsimmons opened 3 years ago

rsimmons commented 3 years ago

With the latest core dictionary:

$ echo 快い | java -jar sudachi-0.5.2.jar -m B -a
快い  形容詞,非自立可能,*,*,形容詞,終止形-一般    良い  快い  イイ  0   []  
EOS

Note that the normalized form is 良い, not 快い as expected.

But this seems correct:

$ echo 快くない | java -jar sudachi-0.5.2.jar -m B -a
快く  形容詞,一般,*,*,形容詞,連用形-一般   快い  快い  ココロヨク   0   []  
ない  形容詞,非自立可能,*,*,形容詞,終止形-一般    無い  ない  ナイ  0   []  
EOS

kawahara-n commented 3 years ago

Thank you for your report.

These are registered as different words.

快い 形容詞,一般, ココロヨイ
快い 形容詞,非自立可能, ヨイ

In this case, however, it should be analyzed as 形容詞,一般. The cause appears to be insufficient training. We will correct this, because 快い read as ヨイ is not a common notation.

As a side note, if you add a punctuation mark, it will be analyzed as ココロヨイ.

快い    形容詞,一般,*,*,形容詞,終止形-一般    快い    快い    ココロヨイ    0    []    
。    補助記号,句点,*,*,*,*    。    。    。    0    []    
EOS
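
For anyone who wants to check such cases in bulk, here is a minimal Python sketch that reads the `-a` output shown in this thread and pulls out the surface, normalized form, and reading of each token. It assumes the real CLI output is tab-separated (the pasted text above shows spaces) and that the first five columns are surface, part of speech, normalized form, dictionary form, and reading, as the examples here suggest. The names `Token`, `parse_sudachi_output`, and `SAMPLE` are hypothetical helpers for illustration and are not part of Sudachi.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    # Column names are assumptions inferred from the output in this thread.
    surface: str
    pos: str
    normalized: str
    dictionary_form: str
    reading: str


def parse_sudachi_output(text: str) -> List[Token]:
    """Parse `-a` style output, assuming tab-separated columns."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the end-of-sentence marker.
        if not line or line == "EOS":
            continue
        fields = [f.strip() for f in line.split("\t")]
        # Ignore lines that do not have the five columns we need.
        if len(fields) < 5:
            continue
        tokens.append(Token(*fields[:5]))
    return tokens


# The first analysis from the report, re-typed with tabs (an assumption).
SAMPLE = "快い\t形容詞,非自立可能,*,*,形容詞,終止形-一般\t良い\t快い\tイイ\t0\t[]\nEOS\n"

for tok in parse_sudachi_output(SAMPLE):
    print(tok.surface, tok.normalized, tok.reading)
```

Comparing `normalized` with `dictionary_form` alone will not find this problem, because a difference there can be legitimate (ない normalizes to 無い above). The parsed fields are only a starting point for whatever comparison suits your data.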