elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

`phrase_slop` question #15174

Closed feifeiiiiiiiiiii closed 8 years ago

feifeiiiiiiiiiii commented 8 years ago

I added `phrase_slop` to a `query_string` query. The query DSL is as follows:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "platform": 0
          }
        },
        {
          "query_string": {
            "default_field": "text",
            "query": "\"美丽的蓝色多瑙河\"",
            "phrase_slop": 1
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "text": {
        "fragment_size": 150
      }
    }
  }
}

But the response reports a shard failure, as follows:

{
    "took": 528,
    "timed_out": false,
    "_shards": {
        "total": 30,
        "successful": 29,
        "failed": 1,
        "failures": [
            {
                "index": "listening_v2_201511",
                "shard": 7,
                "status": 500,
                "reason": "RemoteTransportException[[es13][inet[/10.10.3.98:9300]][indices:data/read/search[phase/fetch/id]]]; nested: FetchPhaseExecutionException[[listening_v2_201511][7]: query[filtered(+platform:[0 TO 0] +(text:\"美丽 丽 蓝色 蓝 色 多瑙河 河\"~1))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@b5b6c454)],from[0],size[10]: Fetch Failed [Failed to highlight field [text]]]; nested: StringIndexOutOfBoundsException[String index out of range: -1]; "
            }
        ]
    },
    "hits": {
        "total": 117,
        "max_score": 87.00564
    ...

How can I solve it?

clintongormley commented 8 years ago

This works for me in 1.7.0 and 2.1.0:

PUT t/t/1
{
  "text": "\"美丽的蓝色多瑙河\"",
  "platform": 0
}

GET _search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "platform": 0
          }
        },
        {
          "query_string": {
            "default_field": "text",
            "query": "\"美丽的蓝色多瑙河\"",
            "phrase_slop": 1
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "text": {
        "fragment_size": 150
      }
    }
  }
}

Please could you provide a full recreation of the problem, plus the stack trace that you're seeing in the logs, and the version of ES that you're using.
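
When collecting the details requested above, it can help to pull the failing shards and their innermost exception out of the search response. The following is a minimal sketch in Python, assuming the response has already been parsed into a dict with the `_shards.failures` shape shown earlier in this thread; the function name `summarize_shard_failures` is hypothetical and not part of any Elasticsearch client.

```python
def summarize_shard_failures(response):
    """Return one (index, shard, status, root_cause) tuple per failed shard.

    Assumes `response` is a search response parsed into a dict, with the
    `_shards.failures` layout shown in this thread (ES 1.x style).
    """
    summaries = []
    for failure in response.get("_shards", {}).get("failures", []):
        reason = failure.get("reason", "")
        # The reason string chains exceptions with "nested: ";
        # the last segment is the innermost cause.
        parts = [part.strip(" ;") for part in reason.split("nested: ")]
        root_cause = parts[-1]
        summaries.append(
            (failure.get("index"), failure.get("shard"), failure.get("status"), root_cause)
        )
    return summaries
```

The full stack trace still has to come from the Elasticsearch logs on the node that hosts the failing shard; this only narrows down which shard and exception to look for.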

feifeiiiiiiiiiii commented 8 years ago

ES version:

version: {
    number: "1.5.2",
    build_hash: "62ff9868b4c8a0c45860bebb259e21980778ab1c",
    build_timestamp: "2015-04-27T09:21:06Z",
    build_snapshot: false,
    lucene_version: "4.10.4"
}

Analyzer:

I use the Chinese IK analyzer (with modified source).

Part of the mapping is as follows:

{
   text: {
      mapping: {
          searchAnalyzer: "ik",
          indexAnalyzer: "ik",
          include_in_all: false,
          omit_norms: true,
          store: "no",
          norms: {
              enabled: false
          },
          term_vector: "with_positions_offsets",
          type: "string"
      },
      path_match: "text"
   }
}

clintongormley commented 8 years ago

I've tried the following on 1.5.2:

PUT t
{
  "mappings": {
    "t": {
      "properties": {
        "text": {
          "omit_norms": true,
          "store": "no",
          "norms": {
            "enabled": false
          },
          "term_vector": "with_positions_offsets",
          "type": "string"
        }
      }
    }
  }
}

PUT t/t/1
{
  "text": "\"美丽的蓝色多瑙河\"",
  "platform": 0
}

GET _search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "platform": 0
          }
        },
        {
          "query_string": {
            "default_field": "text",
            "query": "\"美丽的蓝色多瑙河\"",
            "phrase_slop": 1
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "text": {
        "fragment_size": 150
      }
    }
  }
}

And it works correctly. I haven't tried with the ik analyzer, as it is not supported by Elasticsearch. I suggest trying to recreate the problem without the ik analyzer. If you can do that, then reopen this issue here; otherwise, open an issue on https://github.com/medcl/elasticsearch-analysis-ik
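
One way to narrow this down before filing an issue against the analyzer is to inspect the token offsets it produces. The thread does not confirm the cause, so treat this as an assumption: a `StringIndexOutOfBoundsException` during highlighting would be consistent with token offsets that do not line up with the field text. The sketch below is in Python and checks a token list in the shape returned by the `_analyze` API (`token`, `start_offset`, `end_offset`); the function name `find_bad_offsets` is hypothetical, and the checks are heuristics rather than Lucene's own validation.

```python
def find_bad_offsets(text, tokens):
    """Return the tokens whose offsets cannot safely be used to slice `text`.

    `tokens` is assumed to be a list of dicts with "start_offset" and
    "end_offset" keys, as in an `_analyze` response. Offsets are compared
    against Python string indices, which match Java's UTF-16 offsets only
    for characters in the Basic Multilingual Plane.
    """
    bad = []
    previous_start = 0
    for token in tokens:
        start, end = token["start_offset"], token["end_offset"]
        # Offsets must lie inside the text and must not be inverted.
        out_of_bounds = start < 0 or end > len(text) or start > end
        # Start offsets are expected not to move backwards between tokens.
        goes_backwards = start < previous_start
        if out_of_bounds or goes_backwards:
            bad.append(token)
        else:
            previous_start = start
    return bad
```

If this reports tokens for the text of a document on the failing shard when analyzed with the modified ik analyzer, that output would be useful to include in the issue on the analyzer's repository.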