Closed jaksmid closed 8 years ago
@jaksmid could you provide some documents and the stack trace that is produced when you see this exception please?
@jpountz given that this only happens with size > 0, I'm wondering if this is highlighting trying to highlight the geo field? Perhaps with no documents on a particular shard?
/cc @nknize
I can reproduce something that looks just like this with a lucene test if you apply the patch on https://issues.apache.org/jira/browse/LUCENE-7185
I suspect it may happen with extreme values such as latitude = 90 or longitude = 180 which are used much more in tests with the patch. See seed:
[junit4] Suite: org.apache.lucene.spatial.geopoint.search.TestGeoPointQuery
[junit4] IGNOR/A 0.01s J1 | TestGeoPointQuery.testRandomBig
[junit4] > Assumption #1: 'nightly' test group is disabled (@Nightly())
[junit4] IGNOR/A 0.00s J1 | TestGeoPointQuery.testRandomDistanceHuge
[junit4] > Assumption #1: 'nightly' test group is disabled (@Nightly())
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestGeoPointQuery -Dtests.method=testAllLonEqual -Dtests.seed=4ABB96AB44F4796E -Dtests.locale=id-ID -Dtests.timezone=Pacific/Fakaofo -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 0.35s J1 | TestGeoPointQuery.testAllLonEqual <<<
[junit4] > Throwable #1: java.lang.IllegalArgumentException: Illegal shift value, must be 32..63; got shift=0
[junit4] > at __randomizedtesting.SeedInfo.seed([4ABB96AB44F4796E:DBB16756B45E397A]:0)
[junit4] > at org.apache.lucene.spatial.util.GeoEncodingUtils.geoCodedToPrefixCodedBytes(GeoEncodingUtils.java:109)
[junit4] > at org.apache.lucene.spatial.util.GeoEncodingUtils.geoCodedToPrefixCoded(GeoEncodingUtils.java:89)
[junit4] > at org.apache.lucene.spatial.geopoint.search.GeoPointPrefixTermsEnum$Range.fillBytesRef(GeoPointPrefixTermsEnum.java:236)
[junit4] > at org.apache.lucene.spatial.geopoint.search.GeoPointTermsEnum.nextRange(GeoPointTermsEnum.java:71)
[junit4] > at org.apache.lucene.spatial.geopoint.search.GeoPointPrefixTermsEnum.nextRange(GeoPointPrefixTermsEnum.java:171)
[junit4] > at org.apache.lucene.spatial.geopoint.search.GeoPointPrefixTermsEnum.nextSeekTerm(GeoPointPrefixTermsEnum.java:190)
[junit4] > at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:212)
[junit4] > at org.apache.lucene.spatial.geopoint.search.GeoPointTermQueryConstantScoreWrapper$1.scorer(GeoPointTermQueryConstantScoreWrapper.java:110)
[junit4] > at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
[junit4] > at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:644)
[junit4] > at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:68)
[junit4] > at org.apache.lucene.search.BooleanWeight.optionalBulkScorer(BooleanWeight.java:231)
[junit4] > at org.apache.lucene.search.BooleanWeight.booleanScorer(BooleanWeight.java:297)
[junit4] > at org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:364)
[junit4] > at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:644)
[junit4] > at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:68)
[junit4] > at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:68)
[junit4] > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:666)
[junit4] > at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:91)
[junit4] > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:473)
[junit4] > at org.apache.lucene.spatial.util.BaseGeoPointTestCase.verifyRandomRectangles(BaseGeoPointTestCase.java:835)
[junit4] > at org.apache.lucene.spatial.util.BaseGeoPointTestCase.verify(BaseGeoPointTestCase.java:763)
[junit4] > at org.apache.lucene.spatial.util.BaseGeoPointTestCase.testAllLonEqual(BaseGeoPointTestCase.java:495)
Hi @clintongormley, thank you for your message.
The stack trace is as follows:
RemoteTransportException[[elasticsearch_4][172.17.0.2:9300][indices:data/read/search[phase/fetch/id]]]; nested: FetchPhaseExecutionException[Fetch Failed [Failed to highlight field [cyberdyne_metadata.ner.mitie.model.DISEASE.tag]]]; nested: NumberFormatException[Invalid shift value (65) in prefixCoded bytes (is encoded value really a geo point?)];
Caused by: FetchPhaseExecutionException[Fetch Failed [Failed to highlight field [cyberdyne_metadata.ner.mitie.model.DISEASE.tag]]]; nested: NumberFormatException[Invalid shift value (65) in prefixCoded bytes (is encoded value really a geo point?)];
at org.elasticsearch.search.highlight.PlainHighlighter.highlight(PlainHighlighter.java:123)
at org.elasticsearch.search.highlight.HighlightPhase.hitExecute(HighlightPhase.java:126)
at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:188)
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:592)
at org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:408)
at org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:405)
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:300)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: Invalid shift value (65) in prefixCoded bytes (is encoded value really a geo point?)
at org.apache.lucene.spatial.util.GeoEncodingUtils.getPrefixCodedShift(GeoEncodingUtils.java:134)
at org.apache.lucene.spatial.geopoint.search.GeoPointPrefixTermsEnum.accept(GeoPointPrefixTermsEnum.java:219)
at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:232)
at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:67)
at org.apache.lucene.search.ScoringRewrite.rewrite(ScoringRewrite.java:108)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:220)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:227)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:113)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:113)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:505)
at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:218)
at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:186)
at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:195)
at org.elasticsearch.search.highlight.PlainHighlighter.highlight(PlainHighlighter.java:108)
... 12 more
The field cyberdyne_metadata.ner.mitie.model.DISEASE.tag should not be a geopoint according to the dynamic template.
@rmuir oh, good catch. @clintongormley The stack trace indeed suggests that the issue is with highlighting on the geo field. Regardless of this bug, I wonder whether we should fail early when highlighting on anything but text fields and/or exclude non-text fields from wildcard matching.
I wonder whether we should fail early when highlighting on anything but text fields and/or exclude non-text fields from wildcard matching.
+1 to fail early if the user explicitly defined a non-text field to highlight on, and to exclude non-text fields when using wildcards.
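The behaviour proposed above (reject an explicitly named non-text field, silently skip non-text fields matched by a wildcard) could look roughly like the sketch below. This is only an illustration in Python, not the Elasticsearch implementation: `resolve_highlight_fields`, `TEXT_TYPES`, and the `field_types` dictionary are hypothetical names, and treating only `string`/`text` as highlightable is an assumption.

```python
from fnmatch import fnmatchcase

# Assumption: only these mapping types are considered highlightable text.
TEXT_TYPES = {"string", "text"}


def resolve_highlight_fields(patterns, field_types):
    """Return the field names that should be highlighted.

    patterns    -- field names or wildcard patterns from the highlight request
    field_types -- hypothetical dict mapping field name -> mapping type
    """
    resolved = []
    for pattern in patterns:
        if "*" in pattern or "?" in pattern:
            # Wildcard pattern: skip matching fields that are not text.
            matches = [
                name
                for name, ftype in sorted(field_types.items())
                if fnmatchcase(name, pattern) and ftype in TEXT_TYPES
            ]
        else:
            # Explicit field name: fail early if it cannot be highlighted.
            ftype = field_types.get(pattern)
            if ftype is None:
                raise KeyError(f"unknown field [{pattern}]")
            if ftype not in TEXT_TYPES:
                raise ValueError(
                    f"cannot highlight field [{pattern}] of type [{ftype}]"
                )
            matches = [pattern]
        for name in matches:
            if name not in resolved:
                resolved.append(name)
    return resolved


if __name__ == "__main__":
    mapping = {"tags.nl": "string", "point": "geo_point", "@timestamp": "date"}
    # The wildcard only picks up the string field.
    print(resolve_highlight_fields(["*"], mapping))
```

With this shape, a Kibana-style request that highlights `"*"` would never reach a geo or numeric field, while a request that names such a field directly would get an immediate, readable error.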
I ran into this bug during a live demo... Yes I know, I should have tested all demo scenarios after updating ES :grimacing: . Anyway, +1 for fixing this!
I'm having the same error. It happens with docs that have a location when trying to use "highlight": {... "require_field_match": false ...}
thanks!
I'm unclear as to what exactly is going on here, but I'm running into the same issue. I'm attempting to do a geo bounding box in Kibana while viewing the results in the Discover tab. Disabling highlighting in Kibana fixes the issue, but I would actually like to keep highlighting enabled, since it's super useful otherwise.
It sounds from what others are saying that this should fail when querying on any non-string field, but I am not getting the same failure on numeric fields. Is it just an issue with geoip fields? I suppose another nice thing would be to explicitly allow for configuration of which fields should be highlighted in Kibana.
Please fix this issue.
I wrote two tests so that everyone can reproduce what happens easily: https://github.com/brwe/elasticsearch/commit/ffa242941e4ede34df67301f7b9d46ea8719cc22
In brief:
The plain highlighter tries to highlight whatever the BBQuery provides as terms in the text "60,120" if that is how the geo_point was indexed (if the point was indexed with {"lat": 60, "lon": 120} nothing will happen because we cannot even extract anything from the source). The terms in the text are provided to Lucene as a token stream with a keyword analyzer.
In Lucene, this token stream is converted via a longish call stack into a terms enum. But this terms enum is pulled from the query that contains the terms that are to be highlighted. In this case we call GeoPointMultiTermQuery.getTermsEnum(terms), which wraps the term in a GeoPointTermsEnum. This enum tries to convert a prefix coded geo term back to something else, but because it is really just the string "60,120" it throws the exception we see.
I am not yet sure what a correct fix would look like, but I do wonder why we try highlighting on numeric and geo fields at all. If anyone has an opinion, let me know.
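To make the failure mode described above more concrete: when ordinary text bytes are handed to code that expects a prefix-coded term, whatever byte sits where the shift is expected gets read as a shift. The Python sketch below only illustrates that idea and is not Lucene's code. `read_shift`, `to_signed_byte`, `SHIFT_OFFSET`, and `MAX_SHIFT` are hypothetical names, and both the byte offset and the valid range 0..63 are assumptions made for the example.

```python
# Illustration only; the offset and bounds below are assumptions,
# not values taken from the Lucene source.
SHIFT_OFFSET = 4   # assumed position of the shift byte inside a term
MAX_SHIFT = 63     # assumed largest valid shift


def to_signed_byte(value):
    """Interpret an unsigned byte (0..255) as a signed Java-style byte."""
    return value - 256 if value > 127 else value


def read_shift(term, offset=SHIFT_OFFSET):
    """Read the shift from a term, rejecting values outside 0..MAX_SHIFT."""
    if len(term) <= offset:
        raise ValueError("term is too short to contain a shift byte")
    shift = to_signed_byte(term[offset])
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError(
            f"Invalid shift value ({shift}) in prefixCoded bytes"
        )
    return shift


if __name__ == "__main__":
    # A well-formed term under these assumptions: the shift byte is 32.
    print(read_shift(bytes([0, 0, 0, 0, 32])))
    # Plain text: the byte at the assumed offset is the letter 'A' (65).
    try:
        read_shift(b"textA")
    except ValueError as exc:
        print(exc)
```

Under these assumptions an uppercase ASCII letter at that position yields 65, the same value that appears in the stack trace earlier in this thread, and any byte of a multi-byte UTF-8 character yields a negative number.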
I missed @jpountz's comment:
Regardless of this bug, I wonder whether we should fail early when highlighting on anything but text fields and/or exclude non-text fields from wildcard matching.
I agree. Will make a pr for that.
@brwe you did something similar before: https://github.com/elastic/elasticsearch/pull/11364 - i would have thought that that PR should have fixed this issue?
@clintongormley Yes you are right. #11364 only addresses problems one gets when the way text is indexed is not compatible with the highlighter used. I do not remember why I did not exclude numeric fields then.
Great work. Tnx
:sunglasses:
This is not fixed in 2.3.3 yet, correct?
@rodgermoore It should be fixed in 2.3.3, can you still reproduce the problem?
Ubuntu 14.04-04 Elasticsearch 2.3.3 Kibana 4.5.1 JVM 1.8.0_66
I am still able to reproduce this error in Kibana 4.5.1. I have a dashboard with a search panel with highlighting enabled. On the same Dashboard I have a tile map and after selecting an area in this map using the select function (draw a rectangle) I got the "Invalid shift value (xx) in prefixCoded bytes (is encoded value really a geo point?)" error.
When I alter the json settings file of the search panel and remove highlighting the error does not pop-up.
@rodgermoore I cannot reproduce this but I might do something different from you. Here is my dashboard:
Is that what you did? Can you attach the whole stacktrace from the elasticsearch logs again? If you did not change the logging config the full search request should be in there. Also, if you can please add an example document.
I see you used "text:blah". I did not enter a search at all (so used the default wildcard) and then did the aggregation on the tile map. This resulted in the error.
I can remove the query and still get a result. Can you please attach the relevant part of the elasticsearch log?
Here you go:
[2016-05-19 15:23:08,270][DEBUG][action.search ] [Black King] All shards failed for phase: [query_fetch]
RemoteTransportException[[Black King][192.168.48.18:9300][indices:data/read/search[phase/query+fetch]]]; nested: FetchPhaseExecutionException[Fetch Failed [Failed to highlight field [tags.nl]]]; nested: NumberFormatException[Invalid shift value (115) in prefixCoded bytes (is encoded value really a geo point?)];
Caused by: FetchPhaseExecutionException[Fetch Failed [Failed to highlight field [tags.nl]]]; nested: NumberFormatException[Invalid shift value (115) in prefixCoded bytes (is encoded value really a geo point?)];
at org.elasticsearch.search.highlight.PlainHighlighter.highlight(PlainHighlighter.java:123)
at org.elasticsearch.search.highlight.HighlightPhase.hitExecute(HighlightPhase.java:140)
at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:188)
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:480)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:392)
at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryFetchTransportHandler.messageReceived(SearchServiceTransportAction.java:389)
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: Invalid shift value (115) in prefixCoded bytes (is encoded value really a geo point?)
at org.apache.lucene.spatial.util.GeoEncodingUtils.getPrefixCodedShift(GeoEncodingUtils.java:134)
at org.apache.lucene.spatial.geopoint.search.GeoPointPrefixTermsEnum.accept(GeoPointPrefixTermsEnum.java:219)
at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:232)
at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:67)
at org.apache.lucene.search.ScoringRewrite.rewrite(ScoringRewrite.java:108)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:220)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:227)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:113)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:113)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:505)
at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:218)
at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:186)
at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:195)
at org.elasticsearch.search.highlight.PlainHighlighter.highlight(PlainHighlighter.java:108)
... 12 more
We are using dynamic mapping and we dynamically analyse all string fields using the Dutch language analyzer. All string fields get a non analyzed field: "field.raw" and a Dutch analyzed field "field.nl".
Ah...I was hoping to get the actual request but it is not in the stacktrace after all. Can you also add the individual requests from the panels in your dashboard (in the spy tab) and a screenshot so I can see what the geo bounding box filter filters on? I could then try to reconstruct the request.
Also, are you sure you upgraded all nodes in the cluster? Check with curl -XGET "http://hostname:port/_nodes". Would be great if you could add the output of that here too just to be sure.
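For anyone who wants to run the same check on a larger cluster, the version of every node can be pulled out of the `_nodes` response with a few lines of Python. This is a minimal sketch that assumes the response has the shape of the output posted in this thread (a top-level `nodes` object keyed by node id, each entry with `name` and `version`); `node_versions` and `distinct_versions` are hypothetical helper names.

```python
import json


def node_versions(nodes_response):
    """Map node name -> version from a parsed _nodes response."""
    return {
        node["name"]: node["version"]
        for node in nodes_response["nodes"].values()
    }


def distinct_versions(nodes_response):
    """Return the sorted list of distinct versions in the cluster."""
    return sorted(set(node_versions(nodes_response).values()))


if __name__ == "__main__":
    # Trimmed-down example in the shape of a _nodes response.
    raw = """
    {
      "cluster_name": "elasticsearch",
      "nodes": {
        "RtBthRfeSOSud1XfRRAkSA": {"name": "Black King", "version": "2.3.3"}
      }
    }
    """
    response = json.loads(raw)
    print(node_versions(response))
    print(distinct_versions(response))
```

If `distinct_versions` returns more than one entry, the cluster is running mixed versions and a request may be served by a node that does not contain the fix.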
I have got the exact same issue. I am running 2.3.3. All my nodes (1) are upgraded.
Here you go.
Tile Map Query:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "analyze_wildcard": true,
          "query": "*"
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "geo_bounding_box": {
                "SomeGeoField": {
                  "top_left": {
                    "lat": REMOVED,
                    "lon": REMOVED
                  },
                  "bottom_right": {
                    "lat": REMOVED,
                    "lon": REMOVED
                  }
                }
              },
              "$state": {
                "store": "appState"
              }
            },
            {
              "query": {
                "query_string": {
                  "query": "*",
                  "analyze_wildcard": true
                }
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gte": 1458485686484,
                  "lte": 1463666086484,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "2": {
      "geohash_grid": {
        "field": "SomeGeoField",
        "precision": 5
      }
    }
  }
}
I'm using a single node cluster, here's the info:
{
"cluster_name": "elasticsearch",
"nodes": {
"RtBthRfeSOSud1XfRRAkSA": {
"name": "Black King",
"transport_address": "192.168.48.18:9300",
"host": "192.168.48.18",
"ip": "192.168.48.18",
"version": "2.3.3",
"build": "218bdf1",
"http_address": "192.168.48.18:9200",
"settings": {
"pidfile": "/var/run/elasticsearch/elasticsearch.pid",
"cluster": {
"name": "elasticsearch"
},
"path": {
"conf": "/etc/elasticsearch",
"data": "/var/lib/elasticsearch",
"logs": "/var/log/elasticsearch",
"home": "/usr/share/elasticsearch",
"repo": [
"/home/somename/es_backups"
]
},
"name": "Black King",
"client": {
"type": "node"
},
"foreground": "false",
"config": {
"ignore_system_properties": "true"
},
"network": {
"host": "0.0.0.0"
}
},
"os": {
"refresh_interval_in_millis": 1000,
"name": "Linux",
"arch": "amd64",
"version": "3.19.0-59-generic",
"available_processors": 8,
"allocated_processors": 8
},
"process": {
"refresh_interval_in_millis": 1000,
"id": 1685,
"mlockall": false
},
"jvm": {
"pid": 1685,
"version": "1.8.0_66",
"vm_name": "Java HotSpot(TM) 64-Bit Server VM",
"vm_version": "25.66-b17",
"vm_vendor": "Oracle Corporation",
"start_time_in_millis": 1463663018422,
"mem": {
"heap_init_in_bytes": 6442450944,
"heap_max_in_bytes": 6372720640,
"non_heap_init_in_bytes": 2555904,
"non_heap_max_in_bytes": 0,
"direct_max_in_bytes": 6372720640
},
"gc_collectors": [
"ParNew",
"ConcurrentMarkSweep"
],
"memory_pools": [
"Code Cache",
"Metaspace",
"Compressed Class Space",
"Par Eden Space",
"Par Survivor Space",
"CMS Old Gen"
],
"using_compressed_ordinary_object_pointers": "true"
},
"thread_pool": {
"force_merge": {
"type": "fixed",
"min": 1,
"max": 1,
"queue_size": -1
},
"percolate": {
"type": "fixed",
"min": 8,
"max": 8,
"queue_size": 1000
},
"fetch_shard_started": {
"type": "scaling",
"min": 1,
"max": 16,
"keep_alive": "5m",
"queue_size": -1
},
"listener": {
"type": "fixed",
"min": 4,
"max": 4,
"queue_size": -1
},
"index": {
"type": "fixed",
"min": 8,
"max": 8,
"queue_size": 200
},
"refresh": {
"type": "scaling",
"min": 1,
"max": 4,
"keep_alive": "5m",
"queue_size": -1
},
"suggest": {
"type": "fixed",
"min": 8,
"max": 8,
"queue_size": 1000
},
"generic": {
"type": "cached",
"keep_alive": "30s",
"queue_size": -1
},
"warmer": {
"type": "scaling",
"min": 1,
"max": 4,
"keep_alive": "5m",
"queue_size": -1
},
"search": {
"type": "fixed",
"min": 13,
"max": 13,
"queue_size": 1000
},
"flush": {
"type": "scaling",
"min": 1,
"max": 4,
"keep_alive": "5m",
"queue_size": -1
},
"fetch_shard_store": {
"type": "scaling",
"min": 1,
"max": 16,
"keep_alive": "5m",
"queue_size": -1
},
"management": {
"type": "scaling",
"min": 1,
"max": 5,
"keep_alive": "5m",
"queue_size": -1
},
"get": {
"type": "fixed",
"min": 8,
"max": 8,
"queue_size": 1000
},
"bulk": {
"type": "fixed",
"min": 8,
"max": 8,
"queue_size": 50
},
"snapshot": {
"type": "scaling",
"min": 1,
"max": 4,
"keep_alive": "5m",
"queue_size": -1
}
},
"transport": {
"bound_address": [
"[::]:9300"
],
"publish_address": "192.168.48.18:9300",
"profiles": {}
},
"http": {
"bound_address": [
"[::]:9200"
],
"publish_address": "192.168.48.18:9200",
"max_content_length_in_bytes": 104857600
},
"plugins": [],
"modules": [
{
"name": "lang-expression",
"version": "2.3.3",
"description": "Lucene expressions integration for Elasticsearch",
"jvm": true,
"classname": "org.elasticsearch.script.expression.ExpressionPlugin",
"isolated": true,
"site": false
},
{
"name": "lang-groovy",
"version": "2.3.3",
"description": "Groovy scripting integration for Elasticsearch",
"jvm": true,
"classname": "org.elasticsearch.script.groovy.GroovyPlugin",
"isolated": true,
"site": false
},
{
"name": "reindex",
"version": "2.3.3",
"description": "_reindex and _update_by_query APIs",
"jvm": true,
"classname": "org.elasticsearch.index.reindex.ReindexPlugin",
"isolated": true,
"site": false
}
]
}
}
}
Screenshot, I had to clear out the data:
@rodgermoore does the query you provided work correctly? You said that it started working once you deleted the highlighting and this query doesn't contain highlighting. Could you provide the query that doesn't work?
It does have highlighting enabled. This is the json for the search panel:
{
  "index": "someindex",
  "query": {
    "query_string": {
      "query": "*",
      "analyze_wildcard": true
    }
  },
  "filter": [],
  "highlight": {
    "pre_tags": [
      "@kibana-highlighted-field@"
    ],
    "post_tags": [
      "@/kibana-highlighted-field@"
    ],
    "fields": {
      "*": {}
    },
    "require_field_match": false,
    "fragment_size": 2147483647
  }
}
I can't show the actual data so I selected to show only the timestamp field in the search panel in the screenshot...
When I change the json of the search panel to:
{
  "index": "someindex",
  "filter": [],
  "query": {
    "query_string": {
      "query": "*",
      "analyze_wildcard": true
    }
  }
}
The error disappears.
If my understanding of the patch is correct, it shouldn't matter whether Kibana is including the highlighting field. Elasticsearch should only be trying to highlight string fields, even if a wildcard is being used.
Ok, I managed to reproduce it on 2.3.3. It happens with "geohash": true in the mapping.
Steps are:
DELETE test
PUT test
{
  "mappings": {
    "doc": {
      "properties": {
        "point": {
          "type": "geo_point",
          "geohash": true
        }
      }
    }
  }
}
PUT test/doc/1
{
  "point": "60.12,100.34"
}
POST test/_search
{
  "query": {
    "geo_bounding_box": {
      "point": {
        "top_left": {
          "lat": 61.10078883158897,
          "lon": -170.15625
        },
        "bottom_right": {
          "lat": -64.92354174306496,
          "lon": 118.47656249999999
        }
      }
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}
Sorry, I did not think of that. I will work on another fix.
I took a closer look and it turns out that excluding fields did not fix the issue at all, just in my test by coincidence. The problem is not that we highlight on a non text field but instead that we highlight on text fields and highlighting does not seem to work well with GeoPointInBBoxQuery. After a discussion with @mikemccand and @nknize I opened an issue for Lucene: https://issues.apache.org/jira/browse/LUCENE-7293
I will try to find a workaround for Elasticsearch. @nik9000 pointed out that we might get away with overriding WeightedSpanTermExtractor.extract() in CustomWeightedSpanTermExtractor and just ignoring all geo related queries for now. I will try that out.
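The idea behind that workaround is that term extraction for highlighting walks the query tree, and geo queries are simply skipped because they contribute nothing that could be highlighted in text. The sketch below shows that idea in Python on query-DSL-style dictionaries. It is a conceptual analogue only, not the Java override itself: `extract_terms` and `GEO_QUERY_TYPES` are hypothetical names, and only `term` and `bool` queries in their simplest form are handled.

```python
# Query types that are ignored during term extraction in this sketch.
GEO_QUERY_TYPES = {"geo_bounding_box", "geo_distance", "geo_polygon"}


def extract_terms(query):
    """Collect (field, value) pairs to highlight, skipping geo queries.

    Only the simple forms {"term": {field: value}} and
    {"bool": {"must"/"should"/"filter": [...]}} are understood here.
    """
    terms = []
    for query_type, body in query.items():
        if query_type in GEO_QUERY_TYPES:
            # Nothing in a geo query can be highlighted in text: ignore it.
            continue
        if query_type == "term":
            terms.extend(body.items())
        elif query_type == "bool":
            for occur in ("must", "should", "filter"):
                clauses = body.get(occur, [])
                if isinstance(clauses, dict):
                    clauses = [clauses]
                for clause in clauses:
                    terms.extend(extract_terms(clause))
    return terms


if __name__ == "__main__":
    query = {
        "bool": {
            "must": [{"term": {"jd": "text"}}],
            "filter": [{"geo_bounding_box": {"loc": {"top": 48.9, "left": 41.6,
                                                     "bottom": -23.0, "right": 113.6}}}],
        }
    }
    print(extract_terms(query))
```

The geo clause never reaches the code that would try to enumerate its terms, which is where the exception in this issue is thrown.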
Hi @brwe , I am using ES 2.4.0, and the query is of below nature, and I still get that error. So, is this issue still open?
{
  "query": {
    "function_score": {
      "query": {
        "bool" : {
          "filter" : [ {"geo_bounding_box" : { ... } } ]
        }
      },
      "functions": [ { "gauss": { "Location_GP": { ... } } } ]
    }
  },
  "highlight": { }
}
@ajayar I cannot reproduce the failure on 2.4.1. Could you paste the whole query including the actual geo points and the full error message please? The stack trace should be in the Elasticsearch log.
Sorry for the delay. Please check the details below:
DELETE test_idx
PUT test_idx
{
  "mappings" : {
    "jobs" : {
      "_all": {"enabled":false},
      "properties" : {
        "jd" : {"type":"string"},
        "loc" : {"type":"geo_point", "lat_lon": true}
      }
    }
  }
}
POST test_idx/jobs/1
{
  "jd" : "some आवश्यकता है- आर्य समाज अनाथालय, 68 सिविल लाइन्स, बरेली को एक पुरूष रस text ",
  "loc" : "12.934059 ,77.610741"
}
GET test_idx/jobs/_search
{
  "query": {
    "function_score": {
      "query": {
        "bool" : {
          "filter" : [
            {"geo_bounding_box" : { "loc" : { "top" : 48.934059, "left" : 41.610741, "bottom" : -23.065941, "right" : 113.610741 }}}
          ]
        }
      },
      "functions": [
        {
          "gauss": {
            "loc": {
              "origin": "12.934059 ,77.610741",
              "scale": "4000km",
              "decay": 0.7
            }
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "jd": {}
    }
  }
}
---- Logs ---
RemoteTransportException[[Grizzly][127.0.0.1:9300][indices:data/read/search[phase/fetch/id]]]; nested: FetchPhaseExecutionException[Fetch Failed [Failed to highlight field [jd]]]; nested: NumberFormatException[Invalid shift value (-92) in prefixCoded bytes (is encoded value really a geo point?)];
Caused by: FetchPhaseExecutionException[Fetch Failed [Failed to highlight field [jd]]]; nested: NumberFormatException[Invalid shift value (-92) in prefixCoded bytes (is encoded value really a geo point?)];
at org.elasticsearch.search.highlight.PlainHighlighter.highlight(PlainHighlighter.java:123)
at org.elasticsearch.search.highlight.HighlightPhase.hitExecute(HighlightPhase.java:140)
at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:188)
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:592)
at org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:408)
at org.elasticsearch.search.action.SearchServiceTransportAction$FetchByIdTransportHandler.messageReceived(SearchServiceTransportAction.java:405)
at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: Invalid shift value (-92) in prefixCoded bytes (is encoded value really a geo point?)
at org.apache.lucene.spatial.util.GeoEncodingUtils.getPrefixCodedShift(GeoEncodingUtils.java:134)
at org.apache.lucene.spatial.geopoint.search.GeoPointPrefixTermsEnum.accept(GeoPointPrefixTermsEnum.java:219)
at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:232)
at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:67)
at org.apache.lucene.search.ScoringRewrite.rewrite(ScoringRewrite.java:108)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:220)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extract(CustomQueryScorer.java:95)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:146)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extract(CustomQueryScorer.java:95)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:109)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extract(CustomQueryScorer.java:95)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extractUnknownQuery(CustomQueryScorer.java:82)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:230)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extract(CustomQueryScorer.java:95)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:227)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extract(CustomQueryScorer.java:95)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:227)
at org.elasticsearch.search.highlight.CustomQueryScorer$CustomWeightedSpanTermExtractor.extract(CustomQueryScorer.java:95)
at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:505)
at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:218)
at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:186)
at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:195)
at org.elasticsearch.search.highlight.PlainHighlighter.highlight(PlainHighlighter.java:108)
... 12 more
@ajayar Thanks a lot for the example, that was very helpful. The example works fine for me on 2.4 but on 2.3.5 (where this error should not occur either) the query fails for me. I will open another pr to address this.
However, we must still figure out why you get this error on 2.4 too.
Can you run the example again and then also add the output of
GET test_idx/_settings?human
Hi @brwe thanks a lot for looking into it. Please check the output of the query
GET test_idx/_settings?human
{
"test_idx": {
"settings": {
"index": {
"creation_date": "1474005889110",
"uuid": "a2pkzpAnSk-o-c-ueXWHgQ",
"creation_date_string": "2016-09-16T06:04:49.110Z",
"number_of_replicas": "1",
"number_of_shards": "5",
"version": {
"created_string": "2.3.4",
"created": "2030499"
}
}
}
}
}
@ajangus you are saying you are using 2.4, but this index was created with 2.3.4. Maybe you are missing something on your end?
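The numeric `created` value in the settings output can be checked against `created_string` with a small decoder. The sketch below assumes the id is laid out as two decimal digits each for minor, revision, and build below the major version, which is consistent with the pair "2030499" / "2.3.4" shown above; `decode_version_id` and `created_version` are hypothetical helper names.

```python
def decode_version_id(version_id):
    """Split a numeric version id into (major, minor, revision, build).

    Assumed layout: major * 1_000_000 + minor * 10_000 + revision * 100 + build.
    """
    version_id = int(version_id)
    major = version_id // 1_000_000
    minor = (version_id // 10_000) % 100
    revision = (version_id // 100) % 100
    build = version_id % 100
    return major, minor, revision, build


def created_version(settings_response, index_name):
    """Return the version an index was created with, as 'major.minor.revision'."""
    created = settings_response[index_name]["settings"]["index"]["version"]["created"]
    major, minor, revision, _build = decode_version_id(created)
    return f"{major}.{minor}.{revision}"


if __name__ == "__main__":
    settings = {
        "test_idx": {
            "settings": {"index": {"version": {"created": "2030499"}}}
        }
    }
    print(created_version(settings, "test_idx"))  # prints 2.3.4
```

An index reporting a creation version older than the version you believe you are running is a strong hint that the request is reaching a different node or cluster.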
I too noticed the version, but am not sure why it is so. I am sure that I am running 2.4. But, I will recheck the same on a different machine and come back on this.
@ajayar I am sure you are not creating this index with a cluster that has only 2.4 nodes. If you have a single 2.3.4 node in the cluster this will happen. Maybe you can just paste the output of GET localhost:9200/ (the main action) and the node stats.
Apologies. Yes, you are right it was indeed pointing to 2.3.4. Had my scripts pointing to a wrong node. I don't see the issue happening on 2.4
@brwe I think we can close this again! thanks for reporting back @ajayar
yes. @ajayar thanks again for the detailed report!
Hi guys,
we have upgraded Elasticsearch from 2.3.0 and reindexed our geolocations so the latitude and longitude are stored separately. We have noticed that some of our visualisations started to fail after we added a filter based on a geolocation rectangle. However, map visualisations are working just fine. The problem occurs when we include actual documents. In this case, we get some failed shards (usually 1 out of 5) and the error: Invalid shift value (xx) in prefixCoded bytes (is encoded value really a geo point?).
Details: Our geolocation index is based on:
The query that produces the error is as follows. If we change the query size to 0 (map visualizations example), the query completes without problem.
Elasticsearch version: 2.3.0
OS version: Elasticsearch docker image with head plugin, marvel and big desk installed
Thank you for your help, regards, Jakub Smid