messense / jieba-rs

The Jieba Chinese Word Segmentation Implemented in Rust
MIT License

add_word: two-character words are not segmented, but three-character words work #91

Closed ringrid closed 1 year ago

ringrid commented 1 year ago
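use jieba_rs::Jieba;

let mut jieba = Jieba::new();
// Add a custom word without an explicit frequency or part-of-speech tag.
jieba.add_word("莞城", None, None);
let s = "广东省东莞市莞城区";
let words = jieba.cut(s, true);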

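This returns ["广东省", "东莞市", "莞", "城区"]; "莞城" is not recognized as a word.

However, if I add the word "莞城区" instead, it is segmented as expected.

A second question: in the file loaded by load_dict, can the word frequency and part-of-speech tag be omitted? I tried listing only the words, and it did not seem to have any effect.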

ringrid commented 1 year ago

I raised the word frequency and it now works as expected. Closing the issue.
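
For anyone who lands here later, a rough illustration of why raising the frequency helps. Jieba-style segmenters pick the split whose words have the highest combined probability, so a newly added two-character word with a low frequency can lose to a split made of more frequent dictionary words. The sketch below is a toy model using only the standard library, not jieba-rs's actual code. The function names `segment` and `segment_with_guancheng_freq` and all the frequencies are made up for illustration.

```rust
use std::collections::HashMap;

/// Toy max-probability segmenter: chooses the split whose words have the
/// highest total log-probability. Only a sketch of the idea, not jieba-rs.
fn segment(text: &str, dict: &HashMap<&str, u64>) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let n = chars.len();
    let total: u64 = dict.values().sum();
    let log_total = (total.max(1) as f64).ln();

    // best[i] = (score of the best split of chars[i..], end index of its first word)
    let mut best: Vec<(f64, usize)> = vec![(0.0, n); n + 1];
    for i in (0..n).rev() {
        let mut choice = (f64::NEG_INFINITY, i + 1);
        for j in (i + 1)..=n {
            let word: String = chars[i..j].iter().collect();
            let freq: u64 = match dict.get(word.as_str()) {
                // Clamp to 1 so a zero frequency does not produce ln(0).
                Some(&f) => f.max(1),
                // Unknown single characters get frequency 1 so a path always exists.
                None if j == i + 1 => 1,
                // Unknown multi-character spans are not candidate words.
                None => continue,
            };
            let score = (freq as f64).ln() - log_total + best[j].0;
            if score > choice.0 {
                choice = (score, j);
            }
        }
        best[i] = choice;
    }

    // Walk the recorded choices from the start to rebuild the words.
    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < n {
        let j = best[i].1;
        out.push(chars[i..j].iter().collect::<String>());
        i = j;
    }
    out
}

/// Segments the sentence from this issue with a made-up dictionary in which
/// only the frequency of "莞城" varies.
fn segment_with_guancheng_freq(freq: u64) -> Vec<String> {
    let mut dict: HashMap<&str, u64> = HashMap::new();
    dict.insert("广东省", 1000); // hypothetical frequencies throughout
    dict.insert("东莞市", 500);
    dict.insert("莞", 50);
    dict.insert("城区", 2000);
    dict.insert("区", 5000);
    dict.insert("莞城", freq);
    segment("广东省东莞市莞城区", &dict)
}
```

In this toy dictionary, `segment_with_guancheng_freq(1)` splits the tail as "莞" + "城区", while `segment_with_guancheng_freq(1000)` yields "莞城" + "区". In jieba-rs itself the frequency is the second argument of `add_word`, so passing `Some(freq)` instead of `None` is the equivalent knob. What value is high enough depends on the loaded dictionary.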