wyymichael / paoding

Automatically exported from code.google.com/p/paoding

base.dic contains a byte sequence that is not in GBK #28

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago

Path: svn/trunk/paoding-analysis-1/dic/CJK/base.dic

line 42197

Expected GBK encoding: "南蛮" = b'\xc4\xcf\xc2\xf9'

Actual bytes on that line: "南蛮" = b'\xc4\xcf\xc2\xf9\xf8\xc9\r\n'

What character is b'\xf8\xc9' supposed to be?
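For anyone who wants to reproduce this, here is a minimal Python sketch of the kind of check that surfaces such a line. The helper name `find_undecodable_lines` is made up for illustration and is not part of paoding. The sample bytes are the ones quoted above, and the sketch assumes the dictionary is meant to be read as GBK. As far as I know, 0xF8A1 to 0xFEFE is a user-defined area in GBK, so whether a given decoder rejects the pair may depend on the implementation.

```python
def find_undecodable_lines(data: bytes, encoding: str = "gbk"):
    """Return (line_number, raw_line, error_message) for every line that fails to decode.

    Hypothetical helper for illustration only.
    """
    problems = []
    # Splitting on b"\n" is safe for GBK: 0x0A is never a trail byte.
    for number, raw in enumerate(data.split(b"\n"), start=1):
        line = raw.rstrip(b"\r")  # drop the CR of a CRLF line ending
        try:
            line.decode(encoding)
        except UnicodeDecodeError as exc:
            problems.append((number, line, str(exc)))
    return problems


# The clean entry followed by the entry as it appears in the file.
sample = b"\xc4\xcf\xc2\xf9\r\n" + b"\xc4\xcf\xc2\xf9\xf8\xc9\r\n"

for number, line, message in find_undecodable_lines(sample):
    print(number, line, message)
```

To scan the real file, pass the result of `open(path, "rb").read()` instead of `sample`; line numbers are 1-based, so they should line up with the line number reported here.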

Original issue reported on code.google.com by Jack...@gmail.com on 6 Dec 2008 at 2:39

GoogleCodeExporter commented 9 years ago
This is most likely an invalid character introduced by an editing mistake.
It also should not affect the segmentation results.

Original comment by qieqie.wang on 8 Dec 2008 at 1:28