opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related features within OpenSearch.
Apache License 2.0
98 stars 136 forks source link

[BUG] model_config prefix is changed in payload for ML inference processors #2822

Open mingshl opened 3 months ago

mingshl commented 3 months ago

What is the bug?

since this change in https://github.com/opensearch-project/ml-commons/commit/7cd52915d04d8ac7ddb6e37a74a256603587ce69 in ModelExecutor, the fields in model_config prefix is changed in payload.

before this commit, when using the model_config parameters, the parameter are usually prefix with parameters, for example parameters.prompt.

after this commit, when using the model_config parameters, the parameter will be prefix with 'model_config', for example, model_config.prompt.

now that using ml inference processor with model_config, on both search request, response and ingest, for remote model, we need to set model_input to rewrite the prefix, or the model_config won't be recognized by the connector response body

The pros about the parameters prefix is that it can identify the same parameters name within input_map and model_config when compiling model_input. However, for remote models, the inference processor passed over config such as model_config.prompt, it won't be recognized in the response body, because in the connector, all parameters are prefix with parameter, for example parameters.prompt.

How can one reproduce the bug? Steps to reproduce the behavior:

POST /_plugins/_ml/connectors/_create
{
  "name": "Amazon Bedrock Connector: Claude Instant V1",
  "version": "1",
  "description": "The connector to bedrock Claude model",
  "protocol": "aws_sigv4",
  "parameters": {
    "max_tokens_to_sample": "8000",
    "service_name": "bedrock",
    "temperature": "1.0E-4",
    "response_filter": "$.completion",
    "region": "us-west-2",
    "anthropic_version": "bedrock-2023-05-31",
    "inputs":"please summerize the documents"
  },
  "credential": {},
  "actions": [
    {
      "action_type": "PREDICT",
      "method": "POST",
      "url": "https://bedrock-runtime.us-west-2.amazonaws.com/model/anthropic.claude-instant-v1/invoke",
      "headers": {
        "x-amz-content-sha256": "required",
        "content-type": "application/json"
      },
      "request_body":  "{\"prompt\":\"${parameters.prompt}\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\n\\nHuman:\"]}"
    }
  ]
}

PUT /review_string_index/_doc/1
{
  "review": "Dr. Eric is a fantastic doctor",
  "label":"5 stars"
}

PUT /review_string_index/_doc/2
{
  "review": "happy visit" ,
  "label":"5 stars"
}

PUT /review_string_index/_doc/3
{
  "review": "sad place" ,
  "label":"1 stars"
}

example 1, this is failing:

PUT /_search/pipeline/my_pipeline_request_review_llm
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor is going to run llm",
        "model_id": "cf46K5EBoVpekzRp8x_3",
        "function_name": "REMOTE",
        "input_map": [
          {
            "context": "review"
          }
        ],
        "output_map": [
          {
            "llm_response": "response"
          }
        ],
        "model_config": {
          "prompt":"\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${input_map.context}. \n\n Human: please summarize the documents \n\n Assistant:"
        },
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}

example 2, this is success when we rewrite the model_config.prompt into parameters.prompt

PUT /_search/pipeline/my_pipeline_request_review_llm
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor is going to run llm",
        "model_id": "cf46K5EBoVpekzRp8x_3",
        "model_input": "{ \"prompt\": \"${model_config.prompt}\"}",
        "function_name": "REMOTE",
        "input_map": [
          {
            "context": "review"
          }
        ],
        "output_map": [
          {
            "llm_response": "response"
          }
        ],
        "model_config": {
          "prompt":"\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${input_map.context}. \n\n Human: please summarize the documents \n\n Assistant:"
        },
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}

What is the expected behavior? example 1 should be succeed.

What is your host/environment?

Do you have any screenshots? If applicable, add screenshots to help explain your problem.

Do you have any additional context? Add any other context about the problem.

mingshl commented 3 months ago

maybe one way to fix this bug is to move the reconstructing payload within local model only @rbhavna please share your thoughts in this fix?

move https://github.com/opensearch-project/ml-commons/blob/9663053d544b59ef238ae36bf49b774543552df5/plugin/src/main/java/org/opensearch/ml/processor/ModelExecutor.java#L84-L80

mingshl commented 2 months ago

This issue is blocking using custom prompt in the ml inference response processor,

I have another idea to make this genAI use case to work,

when the model_config contains the the field from input_map and output_map, we should try to cast the datatype into escape string.

for example, ${input_map.context} in going to be in the substring of ${model_config.prompt}, we can try cast the ${input_map.context} into escape string or it won't be a valid payload later.

@rbhavna can you try this case using the latest main branch and reproduce this issue?

PUT /_search/pipeline/my_pipeline_request_review_llm
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor is going to run llm",
        "model_id": "cf46K5EBoVpekzRp8x_3",
        "model_input": "{ \"prompt\": \"${model_config.prompt}\"}",
        "function_name": "REMOTE",
        "input_map": [
          {
            "context": "review"
          }
        ],
        "output_map": [
          {
            "llm_response": "response"
          }
        ],
        "model_config": {
          "prompt":"\n\nHuman: You are a professional data analysist. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${input_map.context}. \n\n Human: please summarize the documents \n\n Assistant:"
        },
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}