autodeployai / ai-serving

Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints
Apache License 2.0

Error with Prediction from TensorFlow Object Detection ONNX #9

Open ansarisam opened 2 years ago

ansarisam commented 2 years ago

I am getting an error when trying to predict from an ONNX model (TensorFlow-based object detection). When I call the ai-serving REST API, I get the following error: b'{"error":"Shape [640, 640, 3], requires 1228800 elements but the buffer has 9830400 elements."}' Here is the code that creates the input and calls the REST API.

import numpy as np
import requests
import onnx_ml_pb2
from ai_serving_pb2 import RecordSpec, Record, PredictRequest, Value
from PIL import Image

port = 6222
base_url = 'http://localhost:' + str(port)
model_name = "objectmodel"
prediction_url = base_url + '/v1/models/' + model_name + '/versions/1'
print("prediction_url=", prediction_url)
image_path = "/content/sampleimage.jpg"

input_tensors = []

# load image
image = Image.open(image_path)
image = image.resize((640, 640), Image.ANTIALIAS)
image = np.array(image).astype('float')

# create tensor
my_tensor = onnx_ml_pb2.TensorProto()
my_tensor.dims.extend(image.shape)
my_tensor.data_type = 1  # 1 = FLOAT in the onnx TensorProto enum
my_tensor.raw_data = image.tobytes()

input_tensors.append(my_tensor)

# predict
headers = {'Content-Type': 'application/x-protobuf'}

# Create an instance of RecordSpec with a records list that contains only the first tensor.
request_message_records = PredictRequest(X=RecordSpec(
    records=[Record(fields={'input_tensor': Value(tensor_value=input_tensors[0])})]
))

# Make prediction for the records request message.
prediction_response_records = requests.post(prediction_url, headers=headers,
                                            data=request_message_records.SerializeToString())
print(prediction_response_records.content)
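For reference, the element counts in that error line up if the model's input is uint8: the shape [640, 640, 3] needs 640 * 640 * 3 = 1,228,800 elements, while astype('float') yields float64, so tobytes() produces 8 bytes per value and the server, counting one byte per uint8 element, sees 1,228,800 * 8 = 9,830,400 elements. A quick sanity check of that arithmetic (assumption: NumPy's 'float' means float64):

import numpy as np

expected = 640 * 640 * 3                      # 1,228,800 elements
buf = np.zeros((640, 640, 3), dtype='float')  # 'float' means float64 (8 bytes each)
print(expected)                               # 1228800
print(len(buf.tobytes()))                     # 9830400 bytes -> 9,830,400 one-byte elements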
scorebot commented 2 years ago

@ansarisam I think the model expects to receive the data as bytes, not floats. You can try replacing the line

image = np.array(image).astype('float')

with

image = np.array(image).astype(np.byte)
ansarisam commented 2 years ago

Thanks for the reply. Changing to image = np.array(image).astype(np.byte) did not solve the problem. The new error is: b'{"error":"Error code - ORT_INVALID_ARGUMENT - message: Unexpected input data type. Actual: (N11onnxruntime17PrimitiveDataTypeIaEE) , expected: (N11onnxruntime17PrimitiveDataTypeIhEE)"}'
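For reference, those mangled type parameters follow the Itanium C++ ABI: 'a' decodes to signed char (int8) and 'h' to unsigned char (uint8), so the runtime received int8 data while the model expects uint8. np.byte is NumPy's alias for int8, which is why the conversion above still mismatches; a quick check:

import numpy as np

print(np.byte is np.int8)  # True: np.byte is signed int8
print(np.dtype(np.uint8))  # uint8, the type the model expects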

scorebot commented 2 years ago

@ansarisam There is a thread (https://github.com/microsoft/onnxruntime/issues/6261) discussing this issue. Currently ai-serving depends on onnxruntime 1.6.0, which does not yet support UINT8 in the Java API. We need to pick up the latest onnxruntime, which should include the commit (https://github.com/microsoft/onnxruntime/pull/8401); that's not a simple task, so we need more time to finish it.

BTW, is there an ONNX model I can use to reproduce the error above, so that I can verify the fix after the update is done?

scorebot commented 2 years ago

@ansarisam Could you try the latest code of ai-serving, which now supports uint8? Use the following to convert the image to uint8:

image = np.array(image).astype('uint8')

NOTE: the new Docker images with the fix are not ready yet; we're working on them. You will need to compile the code to try it. Please let me know if you have any problems.
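For completeness, here is a minimal sketch of the corrected client-side preprocessing against that build. The data_type value 2 comes from the onnx TensorProto enum (FLOAT=1, UINT8=2); whether ai-serving honors this field or infers the type from the model itself is an assumption here:

import numpy as np
import onnx_ml_pb2
from PIL import Image

image_path = "/content/sampleimage.jpg"

# load and resize the image (Image.ANTIALIAS is Image.LANCZOS in newer Pillow)
image = Image.open(image_path)
image = image.resize((640, 640), Image.ANTIALIAS)
array = np.array(image).astype('uint8')  # one byte per element, as the model expects

# build the input tensor with a matching dtype
my_tensor = onnx_ml_pb2.TensorProto()
my_tensor.dims.extend(array.shape)    # [640, 640, 3] -> 1,228,800 elements
my_tensor.data_type = 2               # 2 = UINT8 in onnx TensorProto (assumed honored by ai-serving)
my_tensor.raw_data = array.tobytes()  # 1,228,800 bytes, matching the expected shape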