Open vijay267 opened 1 week ago
Pinging @elastic/es-analytical-engine (Team:Analytics)
Thanks very much for your interest in Elasticsearch.
Quoting the bug report form:
Please also check your OS is supported, and that the version of Elasticsearch has not passed end-of-life. If you are using an unsupported OS or an unsupported version then the issue is likely to be closed.
Your issue relates to Elasticsearch version 7.10 which has passed end-of-life, so I am closing this.
If you're able to reproduce this issue in a supported version, we will reopen this and will be very happy to assist. When you do so, please make sure to include a full list of steps to reproduce, including relevant index mappings and indexed documents, as well as the query response containing the error/wrong results.
I reproduced this exact issue on Elasticsearch 8.16.0, so I would like to reopen this bug report, Alex.
For this request (VariableWidthHistogramRequest.txt), I get this response (VariableWidthHistogramResponse.txt). As you can see, the top hits sub-aggregation puts documents with the wrong score into certain buckets. For example, the bucket for scores 1108.385 - 1108.385 contains a document (id = 5884592-en) with score 4.5.
Here are the index settings and mappings I'm using: EntityIndexMappings.txt, EntityIndexSettings.txt
Here are the sample docs used in the request: doc-111790-en.txt, doc-5829842-en.txt, doc-5829843-en.txt, doc-5878933-en.txt, doc-5884592-en.txt, doc-8221094-en.txt
Let me know if you need any more information.
Elasticsearch Version
7.10.2
Installed Plugins
analysis-kuromoji, analysis-nori, analysis-smartcn, analysis-stconvert, analysis-stempel
Java Version
15
OS Version
Darwin Kernel Version 21.6.0
Problem Description
When I execute a variable width histogram (on score) with a nested top hits sub-aggregation, top hits ends up showing the WRONG documents inside each of the buckets.
So for example, if I have a variable width histogram and the bucket score ranges end up as 0-10, 15-100, and 130-150, the top hits sub-aggregation will often show documents with scores of, say, 140 in the 0-10 bucket. Since I don't have this problem with either the range aggregation or the normal histogram, this seems like a bug. (I don't have Elasticsearch 8.0 to test, so I'm not sure whether the bug reproduces there.)
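To make the mismatch easier to spot in a large response, here is a minimal sketch of a checker that flags every top hit whose `_score` falls outside its bucket's `min`/`max` bounds. The function name `find_misplaced_hits`, the tolerance, and the sample response are hypothetical. The sample only mirrors the 0-10 / 15-100 / 130-150 example above and is not real output. It assumes the aggregation names `scores` and `top_hits_agg` from the request, and that each hit carries a numeric `_score`.

```python
from typing import Any, Dict, List


def find_misplaced_hits(
    response: Dict[str, Any],
    histogram_name: str = "scores",
    top_hits_name: str = "top_hits_agg",
    tolerance: float = 1e-6,
) -> List[Dict[str, Any]]:
    """Return every top hit whose _score lies outside its bucket's [min, max]."""
    misplaced = []
    for bucket in response["aggregations"][histogram_name]["buckets"]:
        low, high = bucket["min"], bucket["max"]
        for hit in bucket[top_hits_name]["hits"]["hits"]:
            score = hit.get("_score")
            if score is None:
                # No score on the hit, so there is nothing to compare.
                continue
            if score < low - tolerance or score > high + tolerance:
                misplaced.append(
                    {
                        "id": hit["_id"],
                        "score": score,
                        "bucket_min": low,
                        "bucket_max": high,
                    }
                )
    return misplaced


# Hypothetical response, shaped like a variable_width_histogram result
# with a top_hits sub-aggregation. The numbers are illustrative only.
sample_response = {
    "aggregations": {
        "scores": {
            "buckets": [
                {
                    "min": 0.0, "key": 5.0, "max": 10.0, "doc_count": 2,
                    "top_hits_agg": {"hits": {"hits": [
                        {"_id": "doc-a", "_score": 140.0},  # wrong bucket
                        {"_id": "doc-b", "_score": 7.5},
                    ]}},
                },
                {
                    "min": 15.0, "key": 60.0, "max": 100.0, "doc_count": 1,
                    "top_hits_agg": {"hits": {"hits": [
                        {"_id": "doc-c", "_score": 42.0},
                    ]}},
                },
                {
                    "min": 130.0, "key": 140.0, "max": 150.0, "doc_count": 1,
                    "top_hits_agg": {"hits": {"hits": [
                        {"_id": "doc-d", "_score": 135.0},
                    ]}},
                },
            ]
        }
    }
}

for problem in find_misplaced_hits(sample_response):
    print(problem)
```

Running the same check against the response of the plain `histogram` or `range` aggregation should report nothing, if those aggregations behave as described.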
Steps to Reproduce
// A simple full text query above works fine on a couple of documents
"aggs": {
  "scores": {
    // "histogram": { // This instead of the variable width histogram works with no issue
    //   "script": "_score",
    //   "interval": 2,
    //   "min_doc_count": 1
    // },
    "variable_width_histogram": {
      "script": "_score",
      "buckets": 3
    },
    "aggs": {
      "top_hits_agg": {
        "top_hits": {
          "sort": [ { "_score": { "order": "desc" } } ],
          "_source": {
            "includes": [
              "_id",
              "full_kg_p_locationbusiness_name_en",
              "numeric_kg_p_locationcustom70452number0"
            ],
            "excludes": ["vector"]
          },
          "from": 0,
          "size": 5
        }
      }
    }
  }
}
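For a side-by-side comparison, the two variants of the aggregation (the failing `variable_width_histogram` and the working `histogram`) can be generated from one helper so that only the bucketing clause differs. This is a rough sketch: `build_scores_agg` is a hypothetical name, and the full text query is left out because it is not shown in the report.

```python
import json
from typing import Any, Dict


def build_scores_agg(kind: str = "variable_width_histogram") -> Dict[str, Any]:
    """Build the "aggs" part of the request for one bucketing variant."""
    if kind == "variable_width_histogram":
        bucketing = {"script": "_score", "buckets": 3}
    elif kind == "histogram":
        bucketing = {"script": "_score", "interval": 2, "min_doc_count": 1}
    else:
        raise ValueError(f"unsupported aggregation kind: {kind}")

    # The top_hits sub-aggregation is identical in both variants.
    top_hits = {
        "sort": [{"_score": {"order": "desc"}}],
        "_source": {
            "includes": [
                "_id",
                "full_kg_p_locationbusiness_name_en",
                "numeric_kg_p_locationcustom70452number0",
            ],
            "excludes": ["vector"],
        },
        "from": 0,
        "size": 5,
    }
    return {
        "aggs": {
            "scores": {
                kind: bucketing,
                "aggs": {"top_hits_agg": {"top_hits": top_hits}},
            }
        }
    }


# Print both variants; add the full text query before sending either one.
for variant in ("variable_width_histogram", "histogram"):
    print(json.dumps(build_scores_agg(variant), indent=2))
```

Keeping the `top_hits` clause shared means any difference in the returned buckets comes from the bucketing aggregation alone.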
Logs (if relevant)
No response