shunk031 / paper-survey

📚 Survey of previous research and related works on machine learning (especially Deep Learning) in Japanese
https://shunk031.github.io/paper-survey/

Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean? #132

Closed shunk031 closed 6 years ago

shunk031 commented 7 years ago

https://arxiv.org/abs/1708.02657

shunk031 commented 6 years ago

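An empirical comparison of how the choice of text encoding unit (UTF-8 bytes, characters, words, or romanized words) affects document classification performance in Chinese, Japanese, and Korean.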
