atilika / kuromoji

Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Apache License 2.0

Why do tokenized Kanji features never contain Hiragana? #119

Closed theGlenn closed 6 years ago

theGlenn commented 6 years ago

I was wondering why tokenizing 寿司 produces the features 名詞,一般,*,*,*,*,寿司,スシ,スシ, which don't include the hiragana すし.

cmoen commented 6 years ago

Readings are written in katakana (スシ), which can be converted to hiragana (すし) if that's better for your use case.
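For anyone landing here later, the conversion can be done with a fixed code point offset, since the katakana letters U+30A1 to U+30F6 sit exactly 0x60 above their hiragana counterparts in Unicode. The following is only a minimal sketch and not part of Kuromoji: the class name `KanaConverter` and the method `toHiragana` are made up for illustration, and the example assumes the IPADIC-style feature string shown above, where the reading is the eighth comma-separated field.

```java
// Hypothetical helper, not part of the Kuromoji API.
final class KanaConverter {

    // Katakana letters ァ (U+30A1) through ヶ (U+30F6) have hiragana
    // counterparts exactly 0x60 code points lower (ぁ U+3041 through ゖ U+3096).
    private static final char KATAKANA_START = '\u30A1';
    private static final char KATAKANA_END = '\u30F6';
    private static final int OFFSET = 0x60;

    private KanaConverter() {
    }

    // Converts katakana letters to hiragana and leaves every other
    // character (kanji, the long vowel mark ー, Latin text) unchanged.
    static String toHiragana(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= KATAKANA_START && c <= KATAKANA_END) {
                out.append((char) (c - OFFSET));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Feature string copied from the question above.
        // Assumption: the reading is at index 7 of the comma-separated fields.
        String features = "名詞,一般,*,*,*,*,寿司,スシ,スシ";
        String reading = features.split(",")[7];
        System.out.println(toHiragana(reading));
    }
}
```

Note that tokens for unknown words may carry fewer fields or `*` in place of a reading, so real code should check the field count before indexing into it.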