opensearch-project / neural-search

Plugin that adds dense neural retrieval into the OpenSearch ecosystem
Apache License 2.0
54 stars 58 forks

Neural Search Leads to "modelId is marked non-null but is null" when Targeting Multiple Indices #759

Open imbarazz opened 1 month ago

imbarazz commented 1 month ago

Note: this error was observed on OpenSearch 2.11 running on AWS cloud.

Performing a neural search against an alias, or a multi-search that lists multiple indices in a single header, leads to the following error:

"null_pointer_exception: modelId is marked non-null but is null".

This is problematic when searching across different indices, each with its own embedding model.

Reproduction

Search Pipeline for Embedder Model 1

PUT /_search/pipeline/embed_pipeline_1

{
  "request_processors": [
    {
      "neural_query_enricher": {
        "neural_field_default_id": {
          "common_vector_field": "embed_model_id_1"
        }
      }
    }
  ]
}

Search Pipeline for Embedder Model 2

PUT /_search/pipeline/embed_pipeline_2

{
  "request_processors": [
    {
      "neural_query_enricher": {
        "neural_field_default_id": {
          "common_vector_field": "embed_model_id_2"
        }
      }
    }
  ]
}
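
For readers unfamiliar with the processor, here is a minimal Python sketch of what the two pipelines above are expected to do: fill in a missing `model_id` on a neural clause from the per-field default. This is a simplified illustration and not the plugin's actual code; the function name `enrich_neural_query` is hypothetical.

```python
import copy


def enrich_neural_query(query, processor_config):
    """Return a copy of `query` with a default model_id filled in per field.

    Hypothetical stand-in for the neural_query_enricher request processor:
    it only handles a top-level "neural" clause and per-field defaults.
    """
    enriched = copy.deepcopy(query)
    field_defaults = processor_config.get("neural_field_default_id", {})
    for field, clause in enriched.get("neural", {}).items():
        # Only fill in the model when the caller did not specify one.
        if "model_id" not in clause and field in field_defaults:
            clause["model_id"] = field_defaults[field]
    return enriched


# Same configuration as embed_pipeline_1 above.
pipeline_1 = {"neural_field_default_id": {"common_vector_field": "embed_model_id_1"}}
query = {"neural": {"common_vector_field": {"query_text": "example", "k": 5}}}

enriched = enrich_neural_query(query, pipeline_1)
# With no pipeline configuration applied, the clause keeps no model_id at all.
unenriched = enrich_neural_query(query, {})
```

In this sketch, a request that never passes through either pipeline reaches the neural query without a `model_id`, which is consistent with the reported error message, though the issue does not confirm that this is the actual cause.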

Update Index 1 with Pipeline 1

PUT /index1/_settings

{
  "index.search.default_pipeline" : "embed_pipeline_1"
}

Update Index 2 with Pipeline 2

PUT /index2/_settings

{
  "index.search.default_pipeline" : "embed_pipeline_2"
}

Perform Multi-Search

GET /_msearch

{"index": ["index1", "index2"]}
{"query": {"neural": {"common_vector_field": {"query_text": "How do I perform a neural multi-search when dealing with multiple indices?", "k": 5}}}, "from": 0, "size": 5}
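
One possible client-side workaround is to avoid the shared header altogether: send one header/body pair per index and set `model_id` explicitly in each neural clause, so the request does not depend on the default pipeline. The Python sketch below only builds the newline-delimited `_msearch` payload. The helper name `build_msearch_body` is hypothetical, the model IDs are the placeholders from the reproduction, and whether this actually avoids the error on 2.11 has not been verified here.

```python
import json


def build_msearch_body(query_text, index_to_model,
                       vector_field="common_vector_field", k=5, size=5):
    """Build an NDJSON _msearch payload with one header/body pair per index."""
    lines = []
    for index, model_id in index_to_model.items():
        header = {"index": index}
        body = {
            "query": {
                "neural": {
                    vector_field: {
                        "query_text": query_text,
                        # Explicit model per index instead of a pipeline default.
                        "model_id": model_id,
                        "k": k,
                    }
                }
            },
            "from": 0,
            "size": size,
        }
        lines.append(json.dumps(header))
        lines.append(json.dumps(body))
    # _msearch expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


payload = build_msearch_body(
    "How do I perform a neural multi-search when dealing with multiple indices?",
    {"index1": "embed_model_id_1", "index2": "embed_model_id_2"},
)
```

The resulting `payload` string can be sent as the body of `GET /_msearch`. Each response in the multi-search result then corresponds to one index, so any merging or re-ranking across indices would have to happen on the client.
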
dblock commented 6 days ago

Catch All Triage - 1 2 3 4 5 6