opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g., k-means or linear regression, to help developers build ML-related features within OpenSearch.
Apache License 2.0

[BUG] post_process_function of painless script does not work while creating externally hosted models using http #1990

Closed ramda1234786 closed 5 months ago

ramda1234786 commented 5 months ago

What is the bug? The post_process_function (a Painless script) does not work when creating externally hosted models using HTTP. The failure shows up in the Predict API.

How can one reproduce the bug? Take any text2text-generation Inference API model from Hugging Face, create the HTTP connector, and add the post_process_function mentioned below; it does not work as expected.

The HF models return a response like this:

[
    {
        "generated_text": "Your Generated text"
    }
]

So we need to translate this using the post_process_function:

"post_process_function": "\n return params['response'][0].generated_text; \n"

I have this

[
    {
        "generated_text": "Your Generated text"
    }
]

and I want to convert it to the following using the post_process_function:

{
    "completion": "Your Generated text"
}

Below is the post_process_function:

"post_process_function": "\n def json = \"{\" +\n \"\\\"completion\\\":\\\"\" + params['response'][0].generated_text + \"\\\" }\";\n return json;\n "

But it never works with the _predict API.
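Since Painless can't easily be run outside the cluster, here is a minimal Python sketch of what the function above is trying to do, and why plain string concatenation is fragile when the generated text contains quotes or newlines (an assumption about the failure mode, not a confirmed root cause):

```python
import json

# Raw model output shape from the HF text2text endpoint, as shown in the issue.
response = [{"generated_text": 'He said "hi"\nand left'}]

# Naive concatenation, mirroring the Painless script: breaks on quotes/newlines.
naive = '{"completion":"' + response[0]["generated_text"] + '"}'
try:
    json.loads(naive)
except json.JSONDecodeError:
    print("naive concatenation produced invalid JSON")

# Escaping via a JSON serializer always yields a valid document.
safe = json.dumps({"completion": response[0]["generated_text"]})
print(json.loads(safe)["completion"] == response[0]["generated_text"])  # True
```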

ylwu-amzn commented 5 months ago

I think this issue could be fixed in 2.12, but before making that promise I want to run some tests to confirm. Can you share detailed steps for reproducing your problem? Can you share your create-connector/model and predict requests?

ramda1234786 commented 5 months ago

Here are the detailed steps @ylwu-amzn

Step: Created a remote model connector successfully and deployed the model:

{
  "name": "google/flan-t5-large",
  "description": "google/flan-t5-large",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "endpoint": "api-inference.huggingface.co",
    "model": "google/flan-t5-large"
  },
  "credential": {
    "hf_key": "xxxxxxxxx"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/models/${parameters.model}",
      "headers": {
        "Authorization": "Bearer ${credential.hf_key}"
      },
       "request_body": "{ \"inputs\": \"${parameters.inputs}\" }"
    }
  ]
}
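For clarity, here is a small Python sketch (not the connector itself) of the request the template above resolves to once the `${parameters...}` and `${credential...}` placeholders are substituted; no network call is made, and the key value is just the placeholder from the config:

```python
# Sketch of the HTTP request the connector template resolves to.
# Values come straight from the "parameters" and "credential" blocks above;
# hf_key is the placeholder from the config, not a real credential.
parameters = {"endpoint": "api-inference.huggingface.co",
              "model": "google/flan-t5-large"}
credential = {"hf_key": "xxxxxxxxx"}

url = f"https://{parameters['endpoint']}/models/{parameters['model']}"
headers = {"Authorization": f"Bearer {credential['hf_key']}"}
request_body = {"inputs": "any text about the movie............"}

print(url)  # https://api-inference.huggingface.co/models/google/flan-t5-large
```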

Other steps: Registered the model and deployed the model.

Step: Ran the predict call and it worked successfully: /_plugins/_ml/models/a3G3vowBu3N8jyGnP/_predict

{
  "parameters": {
    "inputs": "any text about the movie............"
  }
}

Step: Created a RAG pipeline:

{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "movie chat",
        "description": "Demo pipeline Using HF",
        "model_id": "a3G3vowBu3N8jyGnP",
        "context_field_list": ["text"]
      }
    }
  ]
}

Step: Ran a simple search with RAG and it failed:

{
    "query": {
        "match": {
            "vector_text": "What is the genere of rush movie?"
        }
    },
    "size": 1,
    "_source": ["text"],
    "ext": {
        "generative_qa_parameters": {
            "llm_model": "google/flan-t5-large", // I also tried "bedrock/flan-t5-large" after help from Slack saying "bedrock" is hardcoded; with that the call worked but RAG still failed
            "llm_question": "What is the genere of rush movie?",
            "conversation_id": "eHFSxIwBu3N8jyGnExM0",
            "context_size": 5,
            "interaction_size": 2,
            "timeout": 15
        }
    }
}

That is why I then used the post_process_function to convert the array into an object, turning generated_text into completion as a JSON object, since the Java code is written to expect that shape for RAG.

ylwu-amzn commented 5 months ago

@ramda1234786 Thanks. Can you share examples of the model's raw output and the expected output?

ramda1234786 commented 5 months ago

Hi @ylwu-amzn, with the Predict API I am getting this response:

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "response": [
                            {
                                "generated_text": "I am fine....."
                            }
                        ]
                    }
                }
            ],
            "status_code": 200
        }
    ]
}

The expected result is that, without using a post_process_function, it should return the response in RAG search; but RAG only works on the completion key in a JSON object.

So I used the post_process_function to produce a JSON object with a completion key. With the expected result below, it will work:

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "completion": "I'm fine."
                    }
                }
            ],
            "status_code": 200
        }
    ]
}

I have tried the one below, but it is not working:

"post_process_function": "\n def json = \"{\" +\n \"\\\"completion\\\":\\\"\" + params['response'][0].generated_text + \"\\\" }\";\n return json;"

austintlee commented 5 months ago

@ylwu-amzn If you try to use Painless to convert a JSON payload into another JSON document, Java somehow can't seem to recognize the output as JSON.

ylwu-amzn commented 5 months ago

Got it. @ramda1234786, can you try this?

"post_process_function": "\n    def name = 'response';\n    def result = params.result[0].generated_text;\n    def json = \"{\" +\n          '\"name\": \"' + name + '\",' +\n          '\"dataAsMap\": { \"completion\": \"' + result +\n          '\"}}';\n    return json;\n    "
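For readability, here is the same string-building logic as a Python sketch (the real function runs as Painless inside the connector). Note that it still concatenates the raw generated_text, so quotes or newlines in the model output would break the JSON:

```python
import json

# Python sketch of the string the suggested Painless function assembles.
# Input shape mirrors the connector's params: params.result is the HF response array.
params = {"result": [{"generated_text": "I am fine....."}]}

name = "response"
result = params["result"][0]["generated_text"]
json_str = ('{' +
            '"name": "' + name + '",' +
            '"dataAsMap": { "completion": "' + result +
            '"}}')
print(json_str)
# {"name": "response","dataAsMap": { "completion": "I am fine....."}}

# For this benign input, the result parses as valid JSON.
print(json.loads(json_str)["dataAsMap"]["completion"])  # I am fine.....
```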

ylwu-amzn commented 5 months ago

> @ylwu-amzn If you try to use Painless to convert a JSON payload into another JSON document, Java somehow can't seem to recognize the output as JSON.

I think that's mostly caused by an incorrect Painless script. Do you have an example?

austintlee commented 5 months ago

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "result = params.result[0].completion;\n    def ",
          "               ^---- HERE"
        ],
        "script": " ...",
        "lang": "painless",
        "position": {
          "offset": 51,
          "start": 36,
          "end": 82
        }
      }
    ],
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "result = params.result[0].completion;\n    def ",
      "               ^---- HERE"
    ],
    "script": " ...",
    "lang": "painless",
    "position": {
      "offset": 51,
      "start": 36,
      "end": 82
    },
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null"
    }
  },
  "status": 400
}

austintlee commented 5 months ago

@ylwu-amzn ^^

austintlee commented 5 months ago

Is it really params.result, not params['response']?

austintlee commented 5 months ago

The way I am testing is by calling OpenAI and just rewriting the response.

austintlee commented 5 months ago

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "result = params['response'][0].completion;\n    def ",
          "               ^---- HERE"
        ],
        "script": " ...",
        "lang": "painless",
        "position": {
          "offset": 51,
          "start": 36,
          "end": 87
        }
      }
    ],
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "result = params['response'][0].completion;\n    def ",
      "               ^---- HERE"
    ],
    "script": " ...",
    "lang": "painless",
    "position": {
      "offset": 51,
      "start": 36,
      "end": 87
    },
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null"
    }
  },
  "status": 400
}

Well, that didn't work either.

ylwu-amzn commented 5 months ago

@austintlee, can you share the connector config, the model's raw output without post-processing, and the expected output you want with post-processing? I can help do some testing to check whether it's a bug.

ylwu-amzn commented 5 months ago

@ramda1234786, have you tested this post-process function? https://github.com/opensearch-project/ml-commons/issues/1990#issuecomment-1928167932

ramda1234786 commented 5 months ago

Hi @ylwu-amzn, I tried the script mentioned here:

"post_process_function": "\n def name = 'response';\n def result = params.result[0].generated_text;\n def json = \"{\" +\n '\"name\": \"' + name + '\",' +\n '\"dataAsMap\": { \"completion\": \"' + result +\n '\"}}';\n return json;\n "

Something worked here; I am getting a response like this:

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "response": "{\"name\": \"response\",\"dataAsMap\": { \"completion\": \"Hi, I am fine today thanks for asking\n\n"}}"
                    }
                }
            ],
            "status_code": 200
        }
    ]
}

But the expectation is to get the following; the Painless script is really a pain:

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "completion": "I'm fine."
                    }
                }
            ],
            "status_code": 200
        }
    ]
}

ramda1234786 commented 5 months ago

Hi @ylwu-amzn, even though I am not getting the expected output from the _predict API, my RAG search is working now and I am getting the response. So this is a great breakthrough.

But I am not sure why this extra wrapping appeared in dataAsMap, as "response": "{\"name\": \"response\",\"dataAsMap\": ..., while RAG still works; it's a black box. But if this is as expected, then there is no bug, and the Painless script solved this.

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "response": "{\"name\": \"response\",\"dataAsMap\": { \"completion\": \"Hi, I am fine today thanks for asking\n\n"}}"
                    }
                }
            ],
            "status_code": 200
        }
    ]
}
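A plausible explanation for the doubled wrapper (an assumption, not confirmed from the code): the post_process_function returns a JSON *string*, and the connector then stores that string as the value of `response` and serializes it again, escaping the inner quotes a second time. A Python sketch of that mechanism:

```python
import json

# The script's return value is a JSON string...
inner = '{"name": "response","dataAsMap": { "completion": "Hi, I am fine"}}'

# ...which the connector then treats as a plain string value and serializes
# again, producing the escaped blob seen under dataAsMap.response above.
wrapped = json.dumps({"response": inner})
print(wrapped)

# Round-tripping recovers the original string intact.
print(json.loads(wrapped)["response"] == inner)  # True
```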

ylwu-amzn commented 5 months ago

@ramda1234786 That's caused by an escaping issue; you can find the escape method in my PR https://github.com/opensearch-project/ml-commons/pull/2075
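For illustration only, here is a hypothetical escape helper sketched in Python (the actual method is Java code in the PR above; this is not that implementation). The idea is to escape the generated text before splicing it into a hand-built JSON string:

```python
import json

def escape(value: str) -> str:
    """Escape a string for safe embedding inside a JSON string literal.
    Hypothetical sketch; the real escape method lives in the Java PR."""
    replacements = {
        "\\": "\\\\",   # backslash first conceptually; per-char mapping avoids double-escaping
        '"': '\\"',
        "\n": "\\n",
        "\r": "\\r",
        "\t": "\\t",
    }
    return "".join(replacements.get(ch, ch) for ch in value)

# Text with characters that broke the naive concatenation approach.
text = 'line one\nhe said "hi"'
doc = '{"completion": "' + escape(text) + '"}'
print(json.loads(doc)["completion"] == text)  # True
```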

ramda1234786 commented 5 months ago

Hi @ylwu-amzn, I tested this and it worked perfectly. Thanks for this; we can close the issue.