infinilabs / analysis-pinyin

🛵 This Pinyin Analysis plugin is used to do conversion between Chinese characters and Pinyin.
Apache License 2.0

Help needed: match_phrase search returns no results #285

Open LiuFqiang opened 1 year ago

LiuFqiang commented 1 year ago

Checking the analysis output:

GET pinyin_test/_analyze
{ "field": "name.pinyin", "text": ["刘德华"] }

{
  "tokens" : [
    { "token" : "liu", "start_offset" : 0, "end_offset" : 0, "type" : "word", "position" : 0 },
    { "token" : "ldh", "start_offset" : 0, "end_offset" : 0, "type" : "word", "position" : 0 },
    { "token" : "de", "start_offset" : 0, "end_offset" : 0, "type" : "word", "position" : 1 },
    { "token" : "hua", "start_offset" : 0, "end_offset" : 0, "type" : "word", "position" : 2 }
  ]
}

The tokens include ldh, but a match_phrase query for it returns nothing:

GET medcl/_search
{ "query": { "match_phrase": { "name.pinyin": "ldh" } } }

{ "took" : 0, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] } }

xiaoshi2013 commented 6 months ago

Use this instead:

POST medcl/_search
{ "query": { "match_phrase": { "name.pinyin": "liudehua" } } }

Every position must have a token that matches the analysis result for 刘德华.

When ldh is analyzed, it produces tokens that do not match at the same positions.
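The position rule described above can be illustrated with a small Python model of phrase matching. This is a simplified sketch, not the Lucene implementation: terms at the same query position are treated as alternatives, and every query position must find a match at the corresponding document position. The document tokens come from the _analyze output in this thread. The query-side tokens for ldh and liudehua are assumptions about what the pinyin analyzer emits at search time; run _analyze on the query text to see the real ones.

```python
from collections import defaultdict


def by_position(tokens):
    """Group (term, position) pairs into {position: {terms}}."""
    positions = defaultdict(set)
    for term, pos in tokens:
        positions[pos].add(term)
    return positions


def phrase_matches(doc_tokens, query_tokens):
    """Simplified phrase match: every query position needs at least one
    term that also occurs at the corresponding document position."""
    doc = by_position(doc_tokens)
    query = by_position(query_tokens)
    if not query:
        return False
    q_start = min(query)
    for d_start in sorted(doc):
        if all(query[q] & doc.get(d_start + (q - q_start), set()) for q in query):
            return True
    return False


# Indexed tokens for 刘德华, taken from the _analyze output in this thread.
DOC_TOKENS = [("liu", 0), ("ldh", 0), ("de", 1), ("hua", 2)]

# ASSUMPTION: hypothetical query-time tokens; the real analyzer output may differ.
QUERY_LDH = [("l", 0), ("ldh", 0), ("d", 1), ("h", 2)]
QUERY_LIUDEHUA = [("liu", 0), ("liudehua", 0), ("de", 1), ("hua", 2)]

print(phrase_matches(DOC_TOKENS, QUERY_LDH))       # False: nothing indexed matches "d" at position 1
print(phrase_matches(DOC_TOKENS, QUERY_LIUDEHUA))  # True: liu, de, hua line up position by position
```

Under these assumptions, ldh matches at position 0 but the extra query tokens at positions 1 and 2 have no counterpart in the index, so the phrase as a whole fails.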

RealMuDao commented 3 months ago

I have the same problem, on version 8.12.2. The start_offset and end_offset values in the analysis output are all 0.

RealMuDao commented 3 months ago

How should it be configured so that start_offset and end_offset are returned correctly?

RealMuDao commented 3 months ago

Or is there a problem with this version? The example output in the README does show offsets.
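One setting that may be relevant to the offset question: the plugin README documents an ignore_pinyin_offset option that defaults to true and suggests setting it to false when offsets are needed. The sketch below builds index settings with that option as a Python dict and prints the JSON body. The analyzer and tokenizer names (pinyin_with_offsets, pinyin_offset_tokenizer) are hypothetical, and whether this resolves the behavior on 8.12.2 has not been verified here.

```python
import json

# ASSUMPTION: "pinyin_with_offsets" and "pinyin_offset_tokenizer" are made-up names.
# "ignore_pinyin_offset" is the option described in the plugin README (default: true).
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "pinyin_with_offsets": {
                    "tokenizer": "pinyin_offset_tokenizer"
                }
            },
            "tokenizer": {
                "pinyin_offset_tokenizer": {
                    "type": "pinyin",
                    "ignore_pinyin_offset": False
                }
            }
        }
    }
}

# Body for a create-index request (PUT <index-name>).
print(json.dumps(index_settings, indent=2))
```

After creating an index with these settings, running _analyze against the hypothetical pinyin_with_offsets analyzer would show whether start_offset and end_offset are populated.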