infinilabs / analysis-pinyin

🛵 This Pinyin Analysis plugin is used to do conversion between Chinese characters and Pinyin.
Apache License 2.0

ES 5.6 highlighting problem #222

Open zzt93 opened 4 years ago

zzt93 commented 4 years ago

Hello, the highlighting results are wrong. How should I deal with this? I am using the default configuration.

[image attachment]

medcl commented 4 years ago

Could you post a complete example that includes the settings and the data? Also, highlighting can use a different field.

zzt93 commented 4 years ago

Version:

curl localhost:9200/
{
  "name" : "SHDPaA2",
  "cluster_name" : "searcher-dev",
  "cluster_uuid" : "0AeHWcY6QF6idSlOEiuHTA",
  "version" : {
    "number" : "5.6.1",
    "build_hash" : "667b497",
    "build_date" : "2017-09-14T19:22:05.189Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}

Example: I copied the settings and mapping of the index that has the problem, but I cannot reproduce the issue with the copy, which is puzzling.

PUT /asdf/
{
  "mappings": {
    "asdf": {
      "properties": {
        "authStatus": {
          "type": "byte"
        },
        "email": {
          "type": "keyword"
        },
        "infoPublic": {
          "type": "short"
        },
        "mobile": {
          "type": "keyword"
        },
        "mold": {
          "type": "long"
        },
        "personalAffairId": {
          "type": "long"
        },
        "publicType": {
          "type": "byte"
        },
        "superId": {
          "type": "keyword"
        },
        "tags": {
          "type": "keyword"
        },
        "unionId": {
          "type": "keyword"
        },
        "username": {
          "type": "keyword",
          "fields": {
            "pinyin": {
              "type": "text",
              "analyzer": "pinyin"
            }
          }
        }
      }
    }
  },
  "settings": {
    "index": {
      "refresh_interval": "1s",
      "number_of_shards": "10",
      "store": {
        "type": "fs"
      },
      "number_of_replicas": "0"
    }
  }
}

Besides the mappings and settings, is there any other configuration that could affect this?
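
One check that does not depend on guessing which configuration differs is to run the same text through the `_analyze` API on both the original index and the copy, then compare the tokens, positions, and offsets that come back. The following is a minimal sketch in Python that only builds the request and inspects a response; it does not talk to a cluster. The helper names and the sample response are hypothetical and made up for illustration; real output depends on the plugin version and its configuration.

```python
import json
from collections import defaultdict


def build_analyze_request(index, analyzer, text):
    """Return (method, path, body) for an _analyze call against one index."""
    return "GET", f"/{index}/_analyze", {"analyzer": analyzer, "text": text}


def stacked_positions(analyze_response):
    """Return {position: [tokens]} for positions that hold more than one token."""
    by_position = defaultdict(list)
    for tok in analyze_response["tokens"]:
        by_position[tok["position"]].append(tok["token"])
    return {pos: toks for pos, toks in by_position.items() if len(toks) > 1}


# Request to send to each index being compared ("asdf" is the index from the example above).
method, path, body = build_analyze_request("asdf", "pinyin", "刘德华")
print(method, path)
print(json.dumps(body))

# Hypothetical response, NOT captured from a real cluster: it only shows the shape
# of the _analyze output and what "several tokens at one position" looks like.
sample_response = {
    "tokens": [
        {"token": "liu", "start_offset": 0, "end_offset": 1, "type": "word", "position": 0},
        {"token": "ldh", "start_offset": 0, "end_offset": 3, "type": "word", "position": 0},
        {"token": "de", "start_offset": 1, "end_offset": 2, "type": "word", "position": 1},
        {"token": "hua", "start_offset": 2, "end_offset": 3, "type": "word", "position": 2},
    ]
}

print(stacked_positions(sample_response))  # -> {0: ['liu', 'ldh']}
```

If the two indices return different token lists for the same text, the difference lies in the analysis chain rather than in the query or the highlighter.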

zzt93 commented 4 years ago

Also, what does "highlighting can use a different field" mean?

medcl commented 4 years ago

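1. If the pinyin analyzer produces several terms stacked at the same position, the query should not match all of those terms at once, so it is better to set this field's search_analyzer to keyword.
2. The matching logic and the highlighting logic can be separated: use another field whose query scope is narrower and more precise, and use that field for highlighting.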
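
One possible reading of these two suggestions, sketched in Python as plain request bodies (nothing is sent to a cluster). The `search_analyzer` setting on `username.pinyin` follows point 1. For point 2, the sketch assumes a hypothetical extra sub-field `username.text` with the `standard` analyzer, which is not part of the original mapping, and highlights only that field through its own `highlight_query`. The function names are also made up for illustration.

```python
import json


def build_mapping(doc_type="asdf"):
    """Mapping body: pinyin sub-field searched with the keyword analyzer."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    "username": {
                        "type": "keyword",
                        "fields": {
                            # Point 1: index with pinyin, but do not analyze the query text.
                            "pinyin": {
                                "type": "text",
                                "analyzer": "pinyin",
                                "search_analyzer": "keyword",
                            },
                            # Hypothetical sub-field used only for highlighting (assumption).
                            "text": {"type": "text", "analyzer": "standard"},
                        },
                    }
                }
            }
        }
    }


def build_search(user_input):
    """Search body: match on either sub-field, highlight only username.text."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"username.pinyin": user_input}},
                    {"match": {"username.text": user_input}},
                ]
            }
        },
        "highlight": {
            "fields": {
                # Point 2: the highlighter gets its own, narrower query on its own field.
                "username.text": {
                    "highlight_query": {"match": {"username.text": user_input}}
                }
            }
        },
    }


print(json.dumps(build_mapping(), indent=2))
print(json.dumps(build_search("ldh"), indent=2))
```

With this layout the pinyin field decides which documents match, while highlight fragments come only from the field whose tokens line up with the original characters. Whether that fits depends on what users type: pinyin-only input such as `ldh` would match documents but produce no highlight on `username.text`.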