thunlp / SE-WRL

Improved Word Representation Learning with Sememes
MIT License
195 stars, 56 forks

Data question #7

Closed: chuangfortytwo closed this issue 6 years ago

chuangfortytwo commented 6 years ago

Hello, which tool was used to segment the words in the pre-segmented Sogou corpus?

heyLinsir commented 6 years ago

I didn't do the data preprocessing myself, but I recommend the THULAC word segmenter: https://github.com/thunlp/THULAC
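
For anyone who wants to segment their own corpus this way, here is a minimal sketch using the THULAC Python package (`pip install thulac`). It is not the preprocessing that produced the released Sogou data, which remains unknown in this thread. The helper names `make_thulac_segmenter`, `segment_lines`, and `main` are hypothetical, and the one-sentence-per-line, space-separated output is an assumption about what a word2vec-style trainer expects.

```python
from typing import Callable, Iterable, Iterator


def make_thulac_segmenter() -> Callable[[str], str]:
    """Build a segmenter backed by THULAC (third-party: pip install thulac)."""
    import thulac  # imported lazily so the rest of the module works without it

    # seg_only=True skips part-of-speech tagging and returns plain words.
    model = thulac.thulac(seg_only=True)
    # text=True makes cut() return a single space-separated string.
    return lambda line: model.cut(line, text=True)


def segment_lines(lines: Iterable[str],
                  segment: Callable[[str], str]) -> Iterator[str]:
    """Apply `segment` to each non-empty line, yielding the segmented text."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield segment(line)


def main() -> None:
    # Hypothetical usage: segment one sentence and print the result.
    segmenter = make_thulac_segmenter()
    for out in segment_lines(["我爱北京天安门"], segmenter):
        print(out)
```

Call `main()` to try it on a single sentence. For large files, THULAC also provides `cut_f(input_file, output_file)`, which segments a file directly.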

chuangfortytwo commented 6 years ago

OK, thanks a lot.