elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Wrong document score of nested knn query #110316

Closed AlexanderOtt85 closed 4 months ago

AlexanderOtt85 commented 4 months ago

Elasticsearch Version

8.14.1

Installed Plugins

No response

Java Version

bundled

OS Version

docker-image = docker.elastic.co/elasticsearch/elasticsearch:8.14.1

Problem Description

For a nested knn query with score_mode = max, the document score does not match the expected score. I would expect a document score of 0.5852207, which is the maximum score of the inner hit (see 'nested knn query' in 'elasticsearch_8_14_1_knn_query_score.txt').

For a nested query string query, however, the document score is the maximum score of the inner hit (see 'nested querystring query' in 'elasticsearch_8_14_1_knn_query_score.txt').

Steps to Reproduce

elasticsearch_8_14_1_knn_query_score.txt

Logs (if relevant)

No response

elasticsearchmachine commented 4 months ago

Pinging @elastic/es-search (Team:Search)

elasticsearchmachine commented 4 months ago

Pinging @elastic/es-search-relevance (Team:Search Relevance)

benwtrent commented 4 months ago

The default score mode for nested vector queries is indeed max, and that is the only score mode supported. However, other nested queries can support other modes. See the docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html#nested-top-level-params

The default is avg for all queries other than knn.

Could you provide data and minimal steps for replication?

Is the scoring just weird in the explain or is it the actual _score of the doc?
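
To illustrate the score modes mentioned above, here is a minimal Python sketch of how a parent score could be derived from nested child scores. The function name `combine_child_scores` and the child scores other than 0.5852207 are hypothetical, chosen only for illustration; this is not Elasticsearch's implementation, just the arithmetic that the documented modes (avg, max, min, sum, none) describe.

```python
from statistics import mean


def combine_child_scores(scores, score_mode="avg"):
    """Combine the scores of matching nested child docs into one parent score.

    Hypothetical helper for illustration only; it mirrors the documented
    score_mode values of the nested query, not Elasticsearch internals.
    """
    # "none" ignores child scores and assigns the parent a score of 0.
    if score_mode == "none" or not scores:
        return 0.0
    combiners = {"avg": mean, "max": max, "min": min, "sum": sum}
    if score_mode not in combiners:
        raise ValueError(f"unsupported score_mode: {score_mode}")
    return combiners[score_mode](scores)


# 0.5852207 is the best inner-hit score reported in this issue;
# the other two values are made up for the example.
child_scores = [0.5852207, 0.41, 0.30]

print(combine_child_scores(child_scores, "max"))  # 0.5852207
print(combine_child_scores(child_scores, "avg"))
```

With max, the parent score should equal the best inner-hit score, which is the behavior the reporter expected.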

AlexanderOtt85 commented 4 months ago

The actual _score of the doc is incorrect. It should be possible to reproduce with the requests in https://github.com/user-attachments/files/16048978/elasticsearch_8_14_1_knn_query_score.txt

benwtrent commented 4 months ago

@AlexanderOtt85 Those queries are exceptionally complex and contain neither test data nor the responses that were returned.

All the testing I have done shows that scoring is working just fine.

Could you provide the responses for your queries? Possibly some test data that replicates the issue?

AlexanderOtt85 commented 4 months ago

Sorry. The attachment now contains 4 requests (search for ###).

  1. Create index
  2. Index document
  3. knnQuery where the score is not OK: expected score = 0.5852207, current score = 0.5711336
  4. queryStringQuery where the score is OK: expected score = 1.2144799, current score = 1.2144799

elasticsearch_8_14_1_knn_query_score.txt
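
For readers without access to the attachment, the shape of the knn request in step 3 can be sketched as follows. This is only an assumed reconstruction: the nested path `du.medium.keyframe_tc_hit` appears in the response quoted later in this thread, but the vector field name, the query vector, and `num_candidates` are hypothetical placeholders, and the real request in the attachment is more complex. The sketch only builds the JSON body and does not contact a cluster.

```python
import json


def nested_knn_query(path, vector_field, query_vector, num_candidates=10):
    """Build a search body with a knn query wrapped in a nested query.

    Hypothetical helper: the field names passed in are assumptions,
    not the ones used in the original attachment.
    """
    return {
        "explain": True,  # ask for the _explanation of each hit's score
        "query": {
            "nested": {
                "path": path,
                "score_mode": "max",  # parent score = best child score
                "inner_hits": {},     # return the matching nested docs
                "query": {
                    "knn": {
                        "field": vector_field,
                        "query_vector": query_vector,
                        "num_candidates": num_candidates,
                    }
                },
            }
        },
    }


body = nested_knn_query(
    path="du.medium.keyframe_tc_hit",
    vector_field="du.medium.keyframe_tc_hit.vector",  # assumed field name
    query_vector=[0.1, 0.2, 0.3],                     # placeholder vector
)
print(json.dumps(body, indent=2))
```

The printed body could be sent to the `_search` endpoint of the index. Comparing the top-level `_score` with `max_score` of the inner hits is the check the reporter describes.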

benwtrent commented 4 months ago

@AlexanderOtt85

I used the doc in your payload and indexed it and got the following:

"_score": 0.5852207,

"_explanation": {
                    "value": 0.5852207,
                    "description": "Score based on 1 child docs in range from 0 to 377, using score mode Max",
                    "details": [...]
            "inner_hits": {
                    "du.medium.keyframe_tc_hit": {
                        "hits": {
                            "total": {
                                "value": 352,
                                "relation": "eq"
                            },
                            "max_score": 0.5852207,
                            "hits": [
                                {
                                    "_index": "my_index",
                                    "_id": "YmkkuJABEnKFO-LMzsrO",
                                    "_nested": {
                                        "field": "du",
                                        "offset": 1,
                                        "_nested": {
                                            "field": "medium",
                                            "offset": 1,
                                            "_nested": {
                                                "field": "keyframe_tc_hit",
                                                "offset": 8
                                            }
                                        }
                                    },
                                    "_score": 0.5852207
                                },

Everything checks out.

What hardware are you testing this on?

Is it a single node cluster?

benwtrent commented 4 months ago

@AlexanderOtt85 I figured out what's going on while falling asleep. A good night's sleep is the best debugger.

This indeed is a bug, and it is fixed in 8.15 (this is why I didn't see it when I tested).

It surfaces in the following scenario:

So, to fix your particular weird scenario, the answer is:

Sorry for the inconsistency here!

AlexanderOtt85 commented 4 months ago

@benwtrent thanks for the clarification. I also get the expected score of "_score": 0.5852207 with index_options: {type: "hnsw"} and 8.14.2.

I think we will wait for 8.15. When can we expect the 8.15 release?

benwtrent commented 4 months ago

@AlexanderOtt85 8.15 release will be the next one. No public date is set.

AlexanderOtt85 commented 1 month ago

@benwtrent I just updated to 8.15.3 and am testing it. The maximum score of the inner hits is now returned as the document score. However, the score changed between 8.14 and 8.15, as you already mentioned. I would still expect to get a hit with 100%, or a score of 1, if I search with exactly one vector that I have indexed. But this is not the case. Is this a known bug?
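
The expectation of a score of 1 for an identical vector can be checked against the score formulas documented for dense_vector fields. The sketch below assumes the field uses cosine or l2_norm similarity; the thread does not say which similarity is configured, and the sample vector is made up. It computes exact floating-point values only, so approximate or quantized index types may return slightly different scores than this arithmetic.

```python
import math


def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_score(query, vector):
    """Documented _score mapping for cosine: (1 + cosine) / 2."""
    return (1.0 + cosine(query, vector)) / 2.0


def l2_norm_score(query, vector):
    """Documented _score mapping for l2_norm: 1 / (1 + distance^2)."""
    dist_sq = sum((x - y) ** 2 for x, y in zip(query, vector))
    return 1.0 / (1.0 + dist_sq)


indexed = [0.12, -0.45, 0.88]  # made-up vector standing in for an indexed one

# Searching with the exact indexed vector: both mappings give (about) 1.
print(cosine_score(indexed, indexed))
print(l2_norm_score(indexed, indexed))
```

Under these assumptions, an exact comparison of a vector with itself yields a score of 1 (up to floating-point rounding for cosine), so a noticeably lower score would point to approximation in the index or a different similarity setting rather than to the formula itself.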