opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g. k-means and linear regression, to help developers build ML-related features within OpenSearch.
Apache License 2.0

[BUG] Exception when inference processor has full_response_path set to false with no output mapping #2943

Closed mingshl closed 4 weeks ago

mingshl commented 1 month ago

What is the bug?

When the ML inference processor is configured with full_response_path set to false and no output mapping, it throws the exception "reason": "An unexpected error occurred: model inference output cannot find field name: $.inference_results".

A sample inference result looks like the following:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": """ Based on the given context of ["hello world","hi earth","howdy there"], here is a summary of the documents: All three documents appear to be simple greetings using different words."""
          }
        }
      ],
      "status_code": 200
    }
  ]
}

When the ML inference search response processor is configured with no output mapping, the processor adds the default output mapping below, which adds a field inference_results to the documents or search hits, reading from the JSON path $.inference_results in the prediction results.

{
  "output_map": {
    "inference_results": "$.inference_results"
  }
}
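
As a workaround, an explicit inner output mapping can be supplied together with full_response_path: false. A minimal sketch, assuming output_map values are resolved against the tensor's dataAsMap in this mode (the target field name llm_response is only an example):

"output_map": [
  {
    "llm_response": "response"
  }
]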

However, when full_response_path is false, the processor traverses inference_results.output into the single tensor to look for the inner mapping. In this case the single tensor is shown below; because we have already passed through inference_results.output, the ML inference processor cannot find the default model output path $.inference_results:

{
  "name": "response",
  "dataAsMap": {
    "response": """ Based on the given context of ["hello world","hi earth","howdy there"], here is a summary of the documents: All three documents appear to be simple greetings using different words."""
  }
}
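
Evaluated against that inner tensor object, only paths relative to the tensor itself can resolve. Roughly (my reading of the failure, not captured output):

$.inference_results  ->  not found (surfaces as the exception reported above)
$.response           ->  the summary string inside dataAsMap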

How can one reproduce the bug? Steps to reproduce the behavior:

POST /_plugins/_ml/connectors/_create
{
  "name": "Amazon Bedrock Connector: Claude Instant V1",
  "version": "1",
  "description": "The connector to bedrock Claude model",
  "protocol": "aws_sigv4",
  "parameters": {
    "max_tokens_to_sample": "8000",
    "service_name": "bedrock",
    "temperature": "1.0E-4",
    "response_filter": "$.completion",
    "region": "us-west-2",
    "anthropic_version": "bedrock-2023-05-31",
    "inputs":"Please summarize the documents"
  },
  "credential": {
    "access_key": "",
    "secret_key": "",
    "session_token": "",
    "actions": [
      {
        "action_type": "PREDICT",
        "method": "POST",
        "url": "https://bedrock-runtime.us-west-2.amazonaws.com/model/anthropic.claude-instant-v1/invoke",
        "headers": {
          "x-amz-content-sha256": "required",
          "content-type": "application/json"
        },
        "request_body":  "{\"prompt\":\"${parameters.prompt}\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\n\\nHuman:\"]}"
      }
  ]
}

POST /_plugins/_ml/models/_register
{
    "name": "Bedrock LLM model v3",
    "function_name": "remote",
    "description": "Bedrock Claude model",
    "connector_id": "N4F86JEB2YxMNDbhXWFw"
}

POST /_plugins/_ml/models/OoF86JEB2YxMNDbhl2F-/_deploy

POST /_plugins/_ml/models/OoF86JEB2YxMNDbhl2F-/_predict
{
  "parameters": {
    "context": ["hello world", "hi earth", "howdy there"],
    "prompt":"\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context.toString()}. \n\n Human: please summarize the documents \n\n Assistant:"
  }
}

index config

PUT /test-index/_doc/1
{
  "review": "great time"
}
PUT /test-index/_doc/2
{
  "review": "terrible time"
}

search (full_response_path is true, succeeds)

GET /test-index/_search
{
  "query": {
    "match_all": {}
  },
  "size": 1000,
  "search_pipeline": {
    "request_processors": [],
    "response_processors": [
      {
        "ml_inference": {
          "model_id": "",
          "input_map": [
            {
              "context": "review"
            }
          ],
          "one_to_one": false,
          "full_response_path": true,
          "ignore_missing": false,
          "ignore_failure": false,
          "model_config": {
            "prompt": "\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context.toString()}. \n\n Human: please summarize the documents \n\n Assistant:"
          }
        }
      }
    ]
  }
}

response

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test-index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "review": "great time",
          "inference_results": [
            {
              "output": [
                {
                  "name": "response",
                  "dataAsMap": {
                    "response": """ Based on the given context of ["great time","terrible time"], here is a summary:

This context contains two documents - "great time" and "terrible time". "great time" suggests a positive experience or event. "terrible time" suggests a negative experience or event. Without more details provided in the context, I cannot determine more specifics about the documents. The context simply indicates one document had a positive viewpoint while the other had a negative viewpoint."""
                  }
                }
              ],
              "status_code": 200
            }
          ]
        }
      },
      {
        "_index": "test-index",
        "_id": "2",
        "_score": 1,
        "_source": {
          "review": "terrible time",
          "inference_results": [
            {
              "output": [
                {
                  "name": "response",
                  "dataAsMap": {
                    "response": """ Based on the given context of ["great time","terrible time"], here is a summary:

This context contains two documents - "great time" and "terrible time". "great time" suggests a positive experience or event. "terrible time" suggests a negative experience or event. Without more details provided in the context, I cannot determine more specifics about the documents. The context simply indicates one document had a positive viewpoint while the other had a negative viewpoint."""
                  }
                }
              ],
              "status_code": 200
            }
          ]
        }
      }
    ]
  }
}

search (full_response_path is false, fails)

GET /test-index/_search
{
  "query": {
    "match_all": {}
  },
  "size": 1000,
  "search_pipeline": {
    "request_processors": [],
    "response_processors": [
      {
        "ml_inference": {
          "model_id": "",
          "input_map": [
            {
              "context": "review"
            }
          ],
          "one_to_one": false,
          "full_response_path": false,
          "ignore_missing": false,
          "ignore_failure": false,
          "model_config": {
            "prompt": "\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context.toString()}. \n\n Human: please summarize the documents \n\n Assistant:"
          }
        }
      }
    ]
  }
}

response:

{
  "error": {
    "root_cause": [
      {
        "type": "runtime_exception",
        "reason": "An unexpected error occurred: model inference output cannot find field name: $.inference_results"
      }
    ],
    "type": "runtime_exception",
    "reason": "An unexpected error occurred: model inference output cannot find field name: $.inference_results"
  },
  "status": 500
}

What is the expected behavior? When the inference processor has no output mapping:

when full_response_path is true, it should by default add the entire response from the predict API

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": """ Based on the given context of ["hello world","hi earth","howdy there"], here is a summary of the documents: All three documents appear to be simple greetings using different words."""
          }
        }
      ],
      "status_code": 200
    }
  ]
}

when full_response_path is false, it should add the prediction results within the tensors

{
  "response": """ Based on the given context of ["hello world","hi earth","howdy there"], here is a summary of the documents: All three documents appear to be simple greetings using different words."""
}
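
For the test-index example above, the hit _source would then presumably end up looking something like this (a sketch, assuming the tensor's dataAsMap entries are merged into the document as-is):

{
  "review": "great time",
  "response": """ Based on the given context of ["great time","terrible time"], here is a summary: ..."""
}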

What is your host/environment?

Do you have any screenshots? If applicable, add screenshots to help explain your problem.

Do you have any additional context? Add any other context about the problem.

mingshl commented 4 weeks ago

Fixed in #2944, closing.