Closed: ramda1234786 closed this issue 5 months ago.
I think this issue could be fixed in 2.12, but before making that promise I want to run some tests to confirm. Can you share detailed steps for reproducing your problem? Can you share your create connector/model and predict requests?
Here are the detailed steps, @ylwu-amzn.
Step: Created a remote model connector successfully and deployed the model:
{
  "name": "google/flan-t5-large",
  "description": "google/flan-t5-large",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "endpoint": "api-inference.huggingface.co",
    "model": "google/flan-t5-large"
  },
  "credential": {
    "hf_key": "xxxxxxxxx"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/models/${parameters.model}",
      "headers": {
        "Authorization": "Bearer ${credential.hf_key}"
      },
      "request_body": "{ \"inputs\": \"${parameters.inputs}\" }"
    }
  ]
}
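For anyone reproducing this end to end, a connector body like the one above is sent to the ml-commons create-connector endpoint; the response returns a connector_id used in the next step. A sketch (the body placeholder stands for the JSON above):

POST /_plugins/_ml/connectors/_create
{ ... connector body shown above ... }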
Other steps: Registered the model and deployed it.
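A sketch of those two calls, assuming the connector_id returned by the create-connector step (the name and description simply mirror the connector):

POST /_plugins/_ml/models/_register
{
  "name": "google/flan-t5-large",
  "function_name": "remote",
  "description": "google/flan-t5-large remote model",
  "connector_id": "<connector_id>"
}

POST /_plugins/_ml/models/<model_id>/_deploy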
Step: Ran predict and it worked: POST /_plugins/_ml/models/a3G3vowBu3N8jyGnP/_predict
{
  "parameters": {
    "inputs": "any text about the movie............"
  }
}
Step: Created a RAG pipeline:
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "movie chat",
        "description": "Demo pipeline Using HF",
        "model_id": "a3G3vowBu3N8jyGnP",
        "context_field_list": ["text"]
      }
    }
  ]
}
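A sketch of registering that configuration as a search pipeline; the pipeline name rag_pipeline is a placeholder:

PUT /_search/pipeline/rag_pipeline
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "movie chat",
        "description": "Demo pipeline Using HF",
        "model_id": "a3G3vowBu3N8jyGnP",
        "context_field_list": ["text"]
      }
    }
  ]
}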
Step: Ran a simple search with RAG, and it failed:
{
  "query": {
    "match": {
      "vector_text": "What is the genere of rush movie?"
    }
  },
  "size": 1,
  "_source": ["text"],
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "google/flan-t5-large", ## I also tried "bedrock/flan-t5-large" (help on Slack said "bedrock" is hardcoded); with that this call worked but RAG still failed
      "llm_question": "What is the genere of rush movie?",
      "conversation_id": "eHFSxIwBu3N8jyGnExM0",
      "context_size": 5,
      "interaction_size": 2,
      "timeout": 15
    }
  }
}
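For completeness, a request of this shape is issued against the backing index with the pipeline attached; the index and pipeline names here are placeholders:

GET /movies_index/_search?search_pipeline=rag_pipeline
{ ... search body shown above ... }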
That is why I then used a post_process_function to convert the array into an object, mapping generated_text to a completion key in a JSON object, since the Java code is written to expect that shape for RAG.
Thanks @ramda1234786, can you share examples of the model's raw output and the expected output?
Hi @ylwu-amzn, with the predict API I am getting this response:
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": [
              {
                "generated_text": "I am fine....."
              }
            ]
          }
        }
      ],
      "status_code": 200
    }
  ]
}
The expected result is that RAG search should return the response without a post_process_function, but RAG only works when the JSON object carries a completion key, so I used a post_process_function to produce a JSON object with that key. Below is the expected output; with it, RAG will work:
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "completion": "I'm fine."
          }
        }
      ],
      "status_code": 200
    }
  ]
}
I tried the one below, but it is not working:
"post_process_function": "\n def json = \"{\" +\n \"\\\"completion\\\":\\\"\" + params['response'][0].generated_text + \"\\\" }\";\n return json;"
@ylwu-amzn If you try to use Painless to convert a JSON payload to another JSON, Java somehow does not seem to recognize that output as JSON.
Got it @ramda1234786, can you try this:
"post_process_function": "\n def name = 'response';\n def result = params.result[0].generated_text;\n def json = \"{\" +\n '\"name\": \"' + name + '\",' +\n '\"dataAsMap\": { \"completion\": \"' + result +\n '\"}}';\n return json;\n "
I think that is mostly caused by an incorrect Painless script. Do you have an example?
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "result = params.result[0].completion;\n def ",
          " ^---- HERE"
        ],
        "script": " ...",
        "lang": "painless",
        "position": {
          "offset": 51,
          "start": 36,
          "end": 82
        }
      }
    ],
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "result = params.result[0].completion;\n def ",
      " ^---- HERE"
    ],
    "script": " ...",
    "lang": "painless",
    "position": {
      "offset": 51,
      "start": 36,
      "end": 82
    },
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null"
    }
  },
  "status": 400
}
@ylwu-amzn ^^
Is it really params.result, not params['response']?
The way I am testing is calling OpenAI and just rewriting the response.
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "result = params['response'][0].completion;\n def ",
          " ^---- HERE"
        ],
        "script": " ...",
        "lang": "painless",
        "position": {
          "offset": 51,
          "start": 36,
          "end": 87
        }
      }
    ],
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "result = params['response'][0].completion;\n def ",
      " ^---- HERE"
    ],
    "script": " ...",
    "lang": "painless",
    "position": {
      "offset": 51,
      "start": 36,
      "end": 87
    },
    "caused_by": {
      "type": "null_pointer_exception",
      "reason": "Cannot invoke \"Object.getClass()\" because \"callArgs[0]\" is null"
    }
  },
  "status": 400
}
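The NullPointerException in both traces (callArgs[0] is null) is consistent with the script dereferencing a key that is not present in the raw response: the HF body carries generated_text, and an OpenAI-style body carries its text under choices, so params...[0].completion resolves to null. A defensive sketch that fails soft instead of throwing (the key names are assumptions; adjust them to the actual raw body):

"post_process_function": "\n def r = params['response'];\n if (r == null || r.isEmpty() || r[0]['generated_text'] == null) {\n   return '{\"name\":\"response\",\"dataAsMap\":{\"completion\":\"\"}}';\n }\n return '{\"name\":\"response\",\"dataAsMap\":{\"completion\":\"' + r[0]['generated_text'] + '\"}}';\n"

Note this still breaks if generated_text itself contains quotes or newlines; that is the escaping problem resolved further down.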
Well, that didn't work either.
@austintlee, can you share the connector config, the model's raw output without post-processing, and the expected output you want with post-processing? I can do some testing to check whether this is a bug.
@ramda1234786, have you tested this post-process function? https://github.com/opensearch-project/ml-commons/issues/1990#issuecomment-1928167932
Hi @ylwu-amzn, I tried the script mentioned there:
"post_process_function": "\n def name = 'response';\n def result = params.result[0].generated_text;\n def json = \"{\" +\n '\"name\": \"' + name + '\",' +\n '\"dataAsMap\": { \"completion\": \"' + result +\n '\"}}';\n return json;\n "
Something worked; I am getting a response like this:
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": "{\"name\": \"response\",\"dataAsMap\": { \"completion\": \"Hi, I am fine today thanks for asking\n\n"}}"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
But the expectation is to get the output below. The Painless script is really a pain.
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "completion": "I'm fine."
          }
        }
      ],
      "status_code": 200
    }
  ]
}
Hi @ylwu-amzn, even though I am not getting the expected output from the _predict API, my RAG search is working now and I am getting the response. So this is a great breakthrough.
But I am not sure why this extra wrapping appeared under dataAsMap, as "response": "{\"name\": \"response\",\"dataAsMap\": ..., and yet RAG still works; it is a black box. But if this is expected, then there is no bug, and the Painless script solved it:
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": "{\"name\": \"response\",\"dataAsMap\": { \"completion\": \"Hi, I am fine today thanks for asking\n\n"}}"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
@ramda1234786 That is caused by an escaping issue; you can find the escape method in my PR https://github.com/opensearch-project/ml-commons/pull/2075
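With that change (the PR exposes an escape() helper to connector Painless scripts), the function no longer needs hand-rolled escaping. A sketch assuming the HF response shape shown earlier; escape() is assumed to return the string with JSON-special characters escaped but without surrounding quotes:

"post_process_function": "\n def text = params['response'][0]['generated_text'];\n return '{\"name\":\"response\",\"dataAsMap\":{\"completion\":\"' + escape(text) + '\"}}';\n"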
Hi @ylwu-amzn, I tested this and it worked perfectly. Thanks for this; we can close this.
What is the bug? The post_process_function (a Painless script) does not work when creating externally hosted models over HTTP. This shows up in the predict API.
How can one reproduce the bug? Take any text2text-generation Inference API model from Hugging Face, create the HTTP connector, and add the post_process_function mentioned below; it does not work as expected. The HF models return a response like the raw output shown above (an array of objects with a generated_text key), so we need to translate it using a post_process_function such as:
"post_process_function": "\n return params['response'][0].generated_text; \n"
I have the raw output above and want to convert it to the completion form using a post-process function. Below is the post_process_function:
"post_process_function": "\n def json = \"{\" +\n \"\\\"completion\\\":\\\"\" + params['response'][0].generated_text + \"\\\" }\";\n return json;\n "
But it never works with the _predict API.