atilika / kuromoji

Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Apache License 2.0

ソーシャルメディア is not tokenized into two words #137

Open hohno-panopto opened 3 years ago

hohno-panopto commented 3 years ago

I hit an issue where this term is not tokenized into 'social' (ソーシャル) and 'media' (メディア). Is this because these two words are not in the corpus, and will it be resolved in a future release once those words are added?
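
To make the hypothesis in the question concrete, here is a minimal toy sketch of dictionary-driven segmentation. It is not Kuromoji's algorithm or API: `ToySegmenter`, `segment`, and the tiny dictionaries are hypothetical names made up for illustration, and Kuromoji's actual analysis is more involved. The sketch only shows the general idea that a dictionary-based segmenter can split a compound when it knows every part, and otherwise may fall back to a single unknown token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical toy segmenter, for illustration only (not part of Kuromoji).
class ToySegmenter {

    /**
     * Splits text into dictionary words when the whole string can be covered
     * by them; otherwise returns the text unchanged as one unknown token.
     * Indexing is by Java char, which is fine for the katakana used here.
     */
    static List<String> segment(String text, Set<String> dictionary) {
        int n = text.length();
        boolean[] reachable = new boolean[n + 1]; // reachable[i]: text[0..i) splits into known words
        int[] prev = new int[n + 1];              // prev[i]: start index of the last word ending at i
        reachable[0] = true;

        for (int end = 1; end <= n; end++) {
            for (int start = 0; start < end; start++) {
                if (reachable[start] && dictionary.contains(text.substring(start, end))) {
                    reachable[end] = true;
                    prev[end] = start;
                    break; // smallest start first, so the longest final word wins
                }
            }
        }

        if (!reachable[n]) {
            // Some part is unknown: keep the compound as a single token.
            return List.of(text);
        }

        // Walk the back-pointers from the end to rebuild the split.
        List<String> tokens = new ArrayList<>();
        for (int end = n; end > 0; end = prev[end]) {
            tokens.add(0, text.substring(prev[end], end));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String term = "ソーシャルメディア";
        // Both parts known: the compound is split in two.
        System.out.println(segment(term, Set.of("ソーシャル", "メディア")));
        // Only one part known: the compound stays whole.
        System.out.println(segment(term, Set.of("メディア")));
    }
}
```

Whether this is what actually happens for ソーシャルメディア in Kuromoji depends on the dictionary and tokenizer mode in use, which this sketch does not model.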