yuye-aws opened this issue 1 month ago
Hi @b4sjoo! I understand you are quite busy these days. Can you take a look at this issue when you are free? No need to hurry.
The model interface feature should behave consistently no matter how the user predicts the ML model.
I am sorry, but this is expected... The automated model interface only works when you strictly follow the blueprint. If you change something, then you need to update the model interface accordingly. In the future we will improve this, but at this time it still works this way.
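For example, if you changed the blueprint's ${parameters.inputs} to ${parameters.prompt}, the model would presumably need an interface whose input schema requires prompt instead of inputs. A rough sketch of how that could look at registration time (the connector ID is a placeholder, and I have not double-checked the exact shape of the interface field here, so treat it as illustrative):
POST /_plugins/_ml/models/_register
{
  "name": "my remote model",
  "function_name": "remote",
  "connector_id": "<your_connector_id>",
  "interface": {
    "input": "{\"type\":\"object\",\"properties\":{\"parameters\":{\"type\":\"object\",\"properties\":{\"prompt\":{\"type\":\"string\"}},\"required\":[\"prompt\"]}},\"required\":[\"parameters\"]}"
  }
}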
@yuye-aws Curious, what happens when you omit step 3? Is the request with prompt successful?
Yes. It's successful.
@yuye-aws Looks like this issue is resolved. I am going to close the issue; feel free to reopen it. @b4sjoo please raise a new issue to fix this.
Is there a PR fixing this issue?
@mingshl I found that I can still reproduce this issue with the 4 steps in the description, so I have to reopen this issue.
I also found that this issue impacts the text embedding processor. The model interface blocks text embedding when following these steps:
1. Create a Titan embedding connector following the blueprint:
POST /_plugins/_ml/connectors/_create
{
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "us-west-2",
"service_name": "bedrock",
"model": "amazon.titan-embed-text-v1"
},
"credential": {
"access_key": "...",
"secret_key": "..."
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\" }",
"pre_process_function": "connector.pre_process.bedrock.embedding",
"post_process_function": "connector.post_process.bedrock.embedding"
}
]
}
2. Create a model from the connector:
POST /_plugins/_ml/models/_register
{
"name": "anthropic.claude-v2",
"function_name": "remote",
"description": "test model",
"connector_id": "LhNR_5IB5-TXIyuBmqPe"
}
3. Test the embedding model. This step should be OK:
POST /_plugins/_ml/models/MRNT_5IB5-TXIyuBQqOe/_predict/
{
"parameters": {
"inputText": "hello, who are you?"
}
}
4. Create an ingest pipeline with the text_embedding processor:
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text embedding pipeline",
"processors": [
{
"text_embedding": {
"model_id": "MRNT_5IB5-TXIyuBQqOe",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
5. Test the ingest pipeline. You will then receive the error Error validating input schema:
POST _ingest/pipeline/nlp-ingest-pipeline/_simulate
{
"docs": [
{
"_index": "testindex1",
"_id": "1",
"_source":{
"passage_text": "hello world"
}
}
]
}
I'm currently testing another bug and I am hitting this same issue with the text embedding processor. It fails with the ingest pipeline on v1 but works with v2.
"cause": {
"type": "status_exception",
"reason": """Error validating input schema: Validation failed: [$: required property 'parameters' not found] for instance: {"algorithm":"REMOTE","text_docs":["Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky."],"return_bytes":false,"return_number":true,"target_response":["sentence_embedding"]} with schema: {
"type": "object",
"properties": {
"parameters": {
"type": "object",
"properties": {
"inputText": {
"type": "string"
}
},
"required": [
"inputText"
]
}
},
"required": [
"parameters"
]
}"""
},
"status": 400
}
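(Looking at the schema error above, the instance being validated is the processor's internal text_docs payload rather than a parameters object with inputText, which would explain why validation fails here no matter how the field_map is set.)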
I've tried modifying the input text field to inputText, but it still fails:
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text embedding pipeline",
"processors": [
{
"text_embedding": {
"model_id": "MRNT_5IB5-TXIyuBQqOe",
"field_map": {
"inputText": "inputText_embedding"
}
}
}
]
}
POST _ingest/pipeline/nlp-ingest-pipeline/_simulate
{
"docs": [
{
"_index": "testindex1",
"_id": "1",
"_source":{
"inputText": "hello world"
}
}
]
}
+1. I am running into exactly the same problem.
@b4sjoo Can you prioritize this issue? It sounds like a code bug.
Ack, will take a look
I've figured out a workaround. For me, there's no rush to resolve this issue. Take your time @b4sjoo.
Can you share if possible?
I guess it's just updating the interface field to an empty interface?
The workaround is not declaring the model in the parameters. You need to modify the connector blueprint into:
POST /_plugins/_ml/connectors/_create
{
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "us-west-2",
"service_name": "bedrock"
},
"credential": {
"access_key": "...",
"secret_key": "..."
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/amazon.titan-embed-text-v1/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\" }",
"pre_process_function": "connector.pre_process.bedrock.embedding",
"post_process_function": "connector.post_process.bedrock.embedding"
}
]
}
Can you please elaborate on updating the interface field to an empty interface? How would an empty interface work?
@yuye-aws I previously missed replying to this. You can simply update the model_interface field to an empty object, "model_interface": {}, or to "model_interface": {"input":{},"output":{}} if the former does not work. Essentially, not declaring the model is the same as this solution, because it disables automated generation of the model interface when no model is found.
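Concretely, I believe that translates to an update call roughly like this (whether the update-model API takes this field directly, and under exactly this name, is an assumption on my part):
PUT /_plugins/_ml/models/MRNT_5IB5-TXIyuBQqOe
{
  "model_interface": {}
}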
I understand the workaround works, but for customers without enough context, the model interface error would be super confusing if they copy the connector blueprints and then perform text embedding.
What is the bug? When predicting an ML model, even if the connector payload is filled, I still receive the error from the model interface.
How can one reproduce the bug?
1. Create a connector following the blueprint, but change ${parameters.inputs} to ${parameters.prompt}.
2. Create a model from the connector.
3. Predict the model with the inputs parameter. You'll receive a connector payload validation error since the prompt parameter is needed. This is correct.
4. Predict the model with the prompt parameter. You'll receive the model interface error (sketched below).
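For illustration, step 4 is essentially the following predict call; the model ID and prompt text are placeholders:
POST /_plugins/_ml/models/<model_id>/_predict
{
  "parameters": {
    "prompt": "hello, who are you?"
  }
}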
What is the expected behavior? I should receive the model response as long as the connector payload is filled. I also find it quite weird that the bug won't be reproduced if I omit step 3.
What is your host/environment?
Do you have any screenshots? If applicable, add screenshots to help explain your problem.
Do you have any additional context? Add any other context about the problem.