Open jiankunking opened 4 months ago
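IK does not tokenize according to the main.dic dictionary. For example, 创立 is already in the dictionary, but ik_smart does not produce it as a token.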
POST _analyze { "analyzer": "ik_smart", "text": "什么时候创立了公司?" }
Tokenization result:
{ "tokens" : [ { "token" : "什么时候", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "创", "start_offset" : 4, "end_offset" : 5, "type" : "CN_CHAR", "position" : 1 }, { "token" : "立了", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 2 }, { "token" : "公司", "start_offset" : 7, "end_offset" : 9, "type" : "CN_WORD", "position" : 3 } ] }
了 is a stop word, so I don't understand why "立了" comes out as a token.
Expected result:
{ "tokens" : [ { "token" : "什么时候", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "创立", "start_offset" : 4, "end_offset" : 6, "type" : "CN_CHAR", "position" : 1 }, { "token" : "公司", "start_offset" : 7, "end_offset" : 9, "type" : "CN_WORD", "position" : 3 } ] }
Try switching to ik_max_word mode.
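To compare the two modes side by side, a minimal Python sketch like the one below could send the same text through both analyzers. It assumes a node reachable at `http://localhost:9200` with the IK plugin installed; `ES_URL`, `build_analyze_request`, and `analyze` are hypothetical names for illustration, not part of IK or Elasticsearch.

```python
import json
from urllib import request

# Assumption: a local Elasticsearch node with the IK plugin installed.
ES_URL = "http://localhost:9200"


def build_analyze_request(analyzer, text, base_url=ES_URL):
    """Build the same POST _analyze request shown in the issue."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return request.Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer, text, base_url=ES_URL):
    """Send the request and return only the token strings."""
    req = build_analyze_request(analyzer, text, base_url)
    with request.urlopen(req) as resp:
        payload = json.load(resp)
    return [t["token"] for t in payload["tokens"]]


# Example (needs a running node):
#   for name in ("ik_smart", "ik_max_word"):
#       print(name, analyze(name, "什么时候创立了公司?"))
```

If 创立 is in main.dic, it would be expected to show up in the ik_max_word output, since that mode emits overlapping candidates instead of picking a single segmentation.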