tallytarik opened 1 year ago
Just to add, I'm using nmslib with a field mapping like this:

```json
"title_embedding": {
  "type": "knn_vector",
  "dimension": 384,
  "method": {
    "name": "hnsw",
    "space_type": "l2",
    "engine": "nmslib",
    "parameters": {
      "ef_construction": 128,
      "m": 24
    }
  }
}
```
I've just tested using the Lucene engine and the error does not occur with Lucene. (As an aside, with Lucene, the `_explanation` values are all filled in properly without having to use a script for the neural queries.)
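For reference, this is the shape of the Lucene variant I tested: the same mapping with only the engine swapped. I'm assuming the `parameters` block carries over unchanged, since Lucene's HNSW also accepts `ef_construction` and `m`:

```json
"title_embedding": {
  "type": "knn_vector",
  "dimension": 384,
  "method": {
    "name": "hnsw",
    "space_type": "l2",
    "engine": "lucene",
    "parameters": {
      "ef_construction": 128,
      "m": 24
    }
  }
}
```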
Explain logic isn't really supported in either neural-search or k-NN (which does the work under the hood). In neural-search, explain functionality is not implemented, and k-NN has a mock implementation in KNNWeight that returns a constant.
Most probably the error you're seeing is the result of those mock results bubbling up to a higher-level query like `bool`. While we should investigate the error, explain will most likely not be fixed in the near future.
Thanks @martin-gaievski!
I'm ok if there is no detailed explain logic. My bug is just about the error thrown when using the `_score` value, which means you can't use explain at all, which means you can't get the calculated distance per field.

For example, here is what's shown for a successful query (with `explain=true` but without the bug described above):
```json
{
  "hits": {
    "max_score": 0.81763434,
    "hits": [
      {
        <trimmed>
        "_score": 0.81763434,
        "_source": {},
        "_explanation": {
          "value": 0.81763434,
          "description": "sum of:",
          "details": [
            {
              "value": 0.42870614,
              "description": "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='_score * 1', options={}, params={}}\"",
              "details": [
                {
                  "value": 1,
                  "description": "_score: ",
                  "details": [
                    {
                      "value": 1,
                      "description": "No Explanation",
                      "details": []
                    }
                  ]
                }
              ]
            },
            {
              "value": 0.38892817,
              "description": "script score function, computed with script:\"Script{type=inline, lang='painless', idOrCode='_score * 1', options={}, params={}}\"",
              "details": [
                {
                  "value": 1,
                  "description": "_score: ",
                  "details": [
                    {
                      "value": 1,
                      "description": "No Explanation",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    ]
  }
}
```
I think you're referring to the details showing a constant `1` and `No Explanation`. That part is okay, because I was trying to work around it in a different way: by using a `script_score` query and referring to `_score`.

So in this example, that works! The important field is `_explanation -> details -> value`: `0.42870614` and `0.38892817`. These are the individual score values for the two neural queries. We can ignore the constant/`No Explanation` details further down.
But that revealed the bug: referring to `_score` in a `script_score` + `neural` query will sometimes throw an error when `explain=true`.
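To make that concrete, here is a sketch of the query shape I mean, reconstructed from the `_explanation` output above. The `bool`/`should` wrapper, index name, `model_id`, `query_text`, and `k` values are illustrative assumptions, not my exact request:

```
POST /my-index/_search?explain=true
{
  "query": {
    "bool": {
      "should": [
        {
          "script_score": {
            "query": {
              "neural": {
                "title_embedding": {
                  "query_text": "some search text",
                  "model_id": "<model_id>",
                  "k": 5
                }
              }
            },
            "script": { "source": "_score * 1" }
          }
        },
        {
          "script_score": {
            "query": {
              "neural": {
                "description_embedding": {
                  "query_text": "some search text",
                  "model_id": "<model_id>",
                  "k": 5
                }
              }
            },
            "script": { "source": "_score * 1" }
          }
        }
      ]
    }
  }
}
```

Each `script_score` clause shows up in `_explanation` as one "script score function" entry, which is how the two per-field values above are produced.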
The reason I think it's a bug is that it only throws an error when `explain=true`. Without explain, there is no error, and the document's combined `_score` is as expected. That doesn't make sense to me, because I'd expect the line in `KNNScorer::score()` that throws the error with `explain=true` to also throw when calculating the document score. But that doesn't seem to be the case.
So overall, without explain I don't think it's possible to get the distance score of each vector field separately. The best we can get is the document's combined `_score`.

At least I think that is correct? If there's a different way of running multiple neural queries on multiple fields and getting the score of each one separately, I would love to know!
For now, I'm ok using the Lucene engine, where the bug doesn't occur. (In Lucene, if a doc appears in the top-`k` of one query but not the other, the non-matching query won't have any entry in `_explanation`, which is what I expect.)
Transferring to k-NN as it seems it's a k-NN issue.
@vamshin could you help add an assignee?
What is the bug?
When:

- a `script_score` query wraps `neural` queries on multiple (different) vector fields, like in this comment
- the script references `_score`
- `explain=true` is set

then, if a document is returned by some neural field queries (within the sub-query's top-`k`) but not by others, the query fails with a script runtime exception and the error `Null score for the docID: 2147483647`.

(At least I think this is why... I'm new to OpenSearch and neural search, so apologies; my explanation for why this happens is just my best guess!)
How can one reproduce the bug?

1. Create an index with two vector fields, `title_embedding` and `description_embedding` (see the sketch after these steps).
2. Run a `script_score` + `neural` query against both fields with `explain=true`, like the one sketched in the comment above. See an error like the `Null score for the docID: 2147483647` exception quoted above. Note the high `size` and low `k`; you might need to adjust the `query_text` or `k` to find a combination where a document is returned in one neural query's top-`k` and not the other.
3. Remove `explain=true` from the query and notice it succeeds.
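For step 1, a minimal index-creation sketch, assuming both fields use the same nmslib mapping shown earlier in the thread (the index name and dimension are illustrative, and the model deployment and ingest pipeline needed to populate the vectors are omitted):

```
PUT /my-index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "title_embedding": {
        "type": "knn_vector",
        "dimension": 384,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "nmslib",
          "parameters": { "ef_construction": 128, "m": 24 }
        }
      },
      "description_embedding": {
        "type": "knn_vector",
        "dimension": 384,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "nmslib",
          "parameters": { "ef_construction": 128, "m": 24 }
        }
      }
    }
  }
}
```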
What is the expected behavior?
`_score` for the affected field is 0, or the affected field is excluded entirely; either way, the `_explanation` should accurately reflect this.
What is your host/environment?
OpenSearch 2.7, Ubuntu 22.04.
Do you have any additional context?

- I'm not sure why it only happens with `explain=true`. (I can't explain it.)
- It also only happens if using `script_score`. If using multiple `neural` queries directly (sketched below), there is no error. But then there is no per-field score in `_explanation`: the total is correct, but each field's score value is reported as `1`. https://github.com/opensearch-project/k-NN/issues/875 describes this problem.
- My use case: I'd like to try using the similarity scores of each field as features in a Learning to Rank model, which means I need to get each score individually.
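For completeness, a sketch of that "multiple `neural` queries directly" variant, with the same illustrative `model_id`, `query_text`, and `k` placeholders as above; it succeeds with `explain=true` but reports each field's explanation value as `1`:

```
POST /my-index/_search?explain=true
{
  "query": {
    "bool": {
      "should": [
        {
          "neural": {
            "title_embedding": {
              "query_text": "some search text",
              "model_id": "<model_id>",
              "k": 5
            }
          }
        },
        {
          "neural": {
            "description_embedding": {
              "query_text": "some search text",
              "model_id": "<model_id>",
              "k": 5
            }
          }
        }
      ]
    }
  }
}
```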