dotSlashLu / nodescws

scws (Simple Chinese Word Split) node.js binding - scws Chinese word segmentation module for node.js

libscws not treating [nostats] rule correctly #1

Closed: dotSlashLu closed this issue 10 years ago

dotSlashLu commented 10 years ago

Adding stop words to the [nostats] rule has no effect on the result at all.

dotSlashLu commented 10 years ago

https://github.com/dotSlashLu/nodescws/blob/master/libscws/rule.c#L295 This line also fails to report repeated words correctly. The scenario: if I add "日本" on two consecutive lines and "是" on the third, it reports "是" as the repeated word, i.e.:

57 日本
58 日本
59 是
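
For comparison, here is a minimal sketch of the report I would expect for the three lines above, assuming the intended behavior is to flag the second "日本" (line 58) and not "是". The helper `findRepeatedEntries` and the `{ line, word }` entry shape are hypothetical, made up for illustration; they are not part of libscws or nodescws and do not reflect how rule.c is actually written.

```javascript
// Hypothetical helper, NOT part of libscws or nodescws: it only illustrates
// the duplicate report expected for entries within one rule section.
function findRepeatedEntries(numberedLines) {
  const firstSeen = new Map(); // word -> line number of its first occurrence
  const repeats = [];
  for (const { line, word } of numberedLines) {
    if (firstSeen.has(word)) {
      // The word was already defined earlier, so this line is the repeat.
      repeats.push({ word, line, firstLine: firstSeen.get(word) });
    } else {
      firstSeen.set(word, line);
    }
  }
  return repeats;
}

// The scenario from this issue (line numbers as in the rule file).
const entries = [
  { line: 57, word: "日本" },
  { line: 58, word: "日本" },
  { line: 59, word: "是" },
];

console.log(findRepeatedEntries(entries));
```

Under that assumption only line 58 should be reported, with line 57 as the first occurrence, and "是" should not appear in the report.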