wikimedia / search-highlighter

Github mirror of "search/highlighter" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)

Highlighting not working for * field #9

Closed (sschuerz closed this 9 years ago)

sschuerz commented 9 years ago

If I try this query:

GET test_index/text/_search
{
  "query": {
    "match": {
      "_all": {
        "query": "test"
      }
    }
  },
  "highlight": {
    "fields": {
      "*": {
        "type": "experimental"
      }
    }
  }
}

I get the following error:

FetchPhaseExecutionException[[test_index][1]: query[filtered(_all:test)->cache(_type:test)],from[0],size[5]: Fetch Failed [Failed to highlight field [_size]]]; nested: NumberFormatException[For input string: ""];

I only have string fields in my mapping. And I don't use a field called "_size".

If I explicitly specify the fields for highlighting (instead of "*"), it works. However, this is not really an option for me, since the query should be agnostic to new fields in the mapping.
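One possible way to keep the query agnostic to new fields while avoiding "*" is to derive the field list from the mapping at query time. The following is only a sketch under assumptions: `highlight_fields_from_mapping` and `build_search_body` are hypothetical helper names, and `properties` is assumed to be the "properties" object of the type mapping as returned by the get-mapping API. Metadata fields such as _size are not listed under "properties", so they would not be passed to the highlighter.

```python
def highlight_fields_from_mapping(properties, highlighter_type="experimental", prefix=""):
    """Return a highlight "fields" dict covering every string field in a mapping.

    `properties` is assumed to be the "properties" object of a type mapping.
    Only fields declared there are visited, so metadata fields are skipped.
    """
    fields = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse and use dotted paths for the sub-fields.
            fields.update(
                highlight_fields_from_mapping(
                    spec["properties"], highlighter_type, path + "."
                )
            )
        elif spec.get("type") == "string":
            fields[path] = {"type": highlighter_type}
    return fields


def build_search_body(query_text, properties):
    """Build the search body from this issue with an explicit field list."""
    return {
        "query": {"match": {"_all": {"query": query_text}}},
        "highlight": {"fields": highlight_fields_from_mapping(properties)},
    }
```

The resulting body can be sent to the same _search endpoint as above; new string fields are picked up the next time the mapping is fetched.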

search-highlighter version: 1.4.0

sschuerz commented 9 years ago

Has nobody else experienced this issue? I really need a solution for this, especially since I'm facing problems with the built-in highlighters in Elasticsearch. See here: elasticsearch/elasticsearch#8468

Here is a full minimal example to reproduce the bug:

PUT test_index

POST test_index/test_type/
{
  "foo": "foo_value_1",
  "bar": "bar_value_1"
}

GET test_index/test_type/_search
{
  "query": {
    "match": {
      "_all": {
        "query": "foo_value_1"
      }
    }
  },
  "highlight": {
    "fields": {
      "*": {
        "type": "experimental"
      }
    }
  }
}

Here is the full Elasticsearch response:

{
   "took": 8,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 4,
      "failed": 1,
      "failures": [
         {
            "index": "test_index",
            "shard": 4,
            "status": 500,
            "reason": "FetchPhaseExecutionException[[test_index][4]: query[filtered(_all:foo_value_1)->cache(_type:test_type)],from[0],size[10]: Fetch Failed [Failed to highlight field [_size]]]; nested: NumberFormatException[For input string: \"\"]; "
         }
      ]
   },
   "hits": {
      "total": 1,
      "max_score": 0.19178301,
      "hits": []
   }
}
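Note that the request itself does not fail outright: the error is only visible under "_shards", and "hits.total" is 1 while "hits.hits" is empty, which is consistent with the fetch phase failing on the shard that holds the matching document. A minimal sketch for detecting this on the client side follows; `is_partial` and `shard_failures` are hypothetical helper names, and the response is assumed to be already parsed into a dict.

```python
def is_partial(response):
    """True if at least one shard reported a failure for this search."""
    return response.get("_shards", {}).get("failed", 0) > 0


def shard_failures(response):
    """Return (shard, reason) pairs for every failed shard in the response."""
    failures = response.get("_shards", {}).get("failures", [])
    return [(f.get("shard"), f.get("reason", "")) for f in failures]


def report(response):
    """Print one line per failed shard so partial results are not silently used."""
    for shard, reason in shard_failures(response):
        print("shard %s failed: %s" % (shard, reason))
```

Checking `is_partial` before reading the hits avoids mistaking an empty "hits" array for a query that matched nothing.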

Could anyone please look into this? Or does anyone know a workaround (without specifying all fields explicitly)?

search-highlighter version: 1.4.0
elasticsearch version: 1.4.2

nik9000 commented 9 years ago

Hey, sorry for not responding to this. It looks like I wasn't subscribed to my own project... I've recreated the problem like this:

curl -XPUT localhost:9200/test_index

curl -XPOST 'localhost:9200/test_index/test_type?pretty' -d'
{
  "foo": "foo_value_1",
  "bar": "bar_value_1"
}'

curl -XPOST 'localhost:9200/test_index/test_type/_search?pretty' -d'
{
  "query": {
    "match": {
      "_all": {
        "query": "foo_value_1"
      }
    }
  },
  "highlight": {
    "fields": {
      "*": {
        "type": "experimental"
      }
    }
  }
}'

I'll see if I can fix it. I don't have a ton of time but I might be able to do it.

nik9000 commented 9 years ago

@sschuerz - sorry for the super late response. I've got the fix posted for code review at the foundation and we should get it merged within a day or so. It's reasonably simple for me to cut another release of the plugin once that is done.

sschuerz commented 9 years ago

Thank you very much!

In the meantime I switched to the built-in fast vector highlighter, accepting some issues with it (indexing with positions and offsets; explicitly specifying all the fields in the mapping to index them with positions and offsets).
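For anyone taking the same route, the setup described above might look roughly like the sketch below. It assumes string fields and the "term_vector": "with_positions_offsets" mapping option that the fast vector highlighter requires; `fvh_mapping` and `fvh_highlight` are hypothetical helper names, and the field names are the ones from the reproduction example.

```python
def fvh_mapping(field_names):
    """Build a type mapping whose string fields store positions and offsets."""
    return {
        "properties": {
            name: {"type": "string", "term_vector": "with_positions_offsets"}
            for name in field_names
        }
    }


def fvh_highlight(field_names):
    """Build the highlight section that selects the fast vector highlighter."""
    return {"fields": {name: {"type": "fvh"} for name in field_names}}


# Example with the two fields used in this issue.
mapping = fvh_mapping(["foo", "bar"])
highlight = fvh_highlight(["foo", "bar"])
```

The drawback mentioned above remains: every field has to be listed in the mapping before documents are indexed.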

But if there is a new release of the experimental highlighter, I'll try using it again.

nik9000 commented 9 years ago

I've just about finished the release. It should be synced to central in an hour or so.