opensearch-project / ml-commons

ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related features within OpenSearch.
Apache License 2.0

[BUG] Invalid JSON in payload error despite sending a valid JSON with all required parameters #1872

Closed NeuralFlux closed 1 month ago

NeuralFlux commented 7 months ago

What is the bug? Sending a predict request to a model that uses a SageMaker connector, like so:

POST /_plugins/_ml/models/_6gdD40BZqSAbrEiV6DT/_predict
{
  "parameters": {
    "inputs": "test sentence"
  }
}

produces an error

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Invalid JSON in payload"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Invalid JSON in payload"
  },
  "status": 400
}

despite having a complete and valid JSON.

How can one reproduce the bug? Steps to reproduce the behavior:

  1. Set up a connector to a SageMaker endpoint (any endpoint is okay, since the error is in the connector itself)
  2. Deploy the model and note the model ID
  3. Send a predict request: curl -XPOST "http://localhost:9200/_plugins/_ml/models/<Model ID>/_predict" -H 'Content-Type: application/json' -d' { "parameters": { "inputs": "test sentence" } }'
  4. See error
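
For scripting the reproduction, step 3 can also be built in Python. This is a minimal sketch using only the standard library. The host, port, and model ID are the ones shown in this report, and `build_predict_request` is a helper name introduced here for illustration, not part of ml-commons.

```python
import json
import urllib.request

# Assumptions: a local cluster on port 9200 and the model ID from the report.
BASE_URL = "http://localhost:9200"
MODEL_ID = "_6gdD40BZqSAbrEiV6DT"


def build_predict_request(model_id: str, inputs) -> urllib.request.Request:
    """Build (but do not send) the predict request from step 3."""
    body = json.dumps({"parameters": {"inputs": inputs}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/_plugins/_ml/models/{model_id}/_predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_predict_request(MODEL_ID, "test sentence")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it against a running cluster, uncomment:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Passing a list such as `["test sentence"]` as `inputs` produces the variant that is reported to work in the additional context.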

What is the expected behavior? The request should trigger processing of request_body in the connector without any errors.

What is your host/environment?

Do you have any additional context? Passing "inputs": ["test sentence"] works. However, I need the embedding of just the sentence, without the extra square brackets. Moreover, ${parameters.input}[0] works without any errors but gave a different embedding for the test sentence than expected.

ramda1234786 commented 7 months ago

Hi @NeuralFlux, this works well for me. There may be a problem in the request_body you used when creating the connector.

It should be

"request_body": "{ \"inputs\": \"${parameters.inputs}\" }"


NeuralFlux commented 7 months ago

I realized there are different standards for what counts as "valid" JSON. May I know which standard is used to check the payload? Plain strings are valid JSON documents according to RFC 7159 and RFC 8259.
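
To make the distinction concrete, here is a small check using Python's `json` module, which accepts top-level scalars in the spirit of RFC 7159/8259. It only illustrates the standard being referred to; whether ml-commons validates the payload the same way is not established in this thread, and `is_valid_json` is a name made up for this sketch.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as a JSON document with Python's json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json('"test sentence"'))    # True: a quoted string is a valid JSON text
print(is_valid_json('test sentence'))      # False: bare words are not JSON
print(is_valid_json('["test sentence"]'))  # True: an array containing a string
```

Note that a plain string only counts as JSON when it keeps its surrounding double quotes.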

b4sjoo commented 7 months ago

Hi @NeuralFlux, could you please share your connector configuration?

NeuralFlux commented 7 months ago

Sure thing, it's

"actions": [
      {
         "action_type": "predict",
         "method": "POST",
         "headers": {
            "content-type": "application/x-text"
         },
         "url": "<INFERENCE_ENDPOINT>",
         "request_body": "${parameters.inputs}"
      }
   ]
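
One possible explanation, sketched below, is that the placeholder is replaced with the raw parameter value before the payload is checked. The `fill_template` function is a hypothetical stand-in written for this illustration, not the ml-commons implementation, and its substitution rule (strings inserted as-is, other values as JSON text) is an assumption. Under that assumption, the results line up with what was reported: the bare template fails for a string, works for a list, and the quoted template suggested earlier works for a string.

```python
import json
import re


def fill_template(template: str, parameters: dict) -> str:
    """Hypothetical substitution: replace each ${parameters.<name>} placeholder."""
    def lookup(match: re.Match) -> str:
        value = parameters[match.group(1)]
        # Assumption: strings are inserted as-is, other values as JSON text.
        return value if isinstance(value, str) else json.dumps(value)
    return re.sub(r"\$\{parameters\.(\w+)\}", lookup, template)


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses with Python's json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


bare = "${parameters.inputs}"                      # request_body from this connector
quoted = '{ "inputs": "${parameters.inputs}" }'    # request_body suggested earlier

cases = [
    (bare, "test sentence"),
    (bare, ["test sentence"]),
    (quoted, "test sentence"),
]
for template, inputs in cases:
    payload = fill_template(template, {"inputs": inputs})
    print(repr(payload), is_valid_json(payload))
```

If this is what happens, quoting the placeholder inside the template (for example "\"${parameters.inputs}\"") would be the way to send a plain JSON string, although that has not been verified against a real connector here.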

mingshl commented 1 month ago

This might be resolved by using the model interface: https://github.com/opensearch-project/ml-commons/issues/2354