autodeployai / ai-serving

Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints
Apache License 2.0
144 stars · 31 forks

AI-Serving not supporting ONNX models with dynamic axes? #2

Closed ajertec closed 4 years ago

ajertec commented 4 years ago

Hello, I got the AI-Serving server up and running.

However, when I load my custom model with dynamic axes (the batch axis, or some other axis), I get Response 500 (Internal Server Error). My models are PyTorch models converted to .onnx with the torch.onnx.export function.
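For illustration, a minimal sketch of such a prediction request body, assuming the record-oriented JSON payload shape from the AI-Serving README; the input name "input", the feature values, and the default port 9090 are hypothetical placeholders:

```python
import json

# Minimal sketch of a record-oriented JSON prediction payload for AI-Serving.
# The outer "X" key follows the project's REST convention; the input name
# "input" and the 4-feature rows are hypothetical placeholders.
batch = [[0.1, 0.2, 0.3, 0.4],
         [0.5, 0.6, 0.7, 0.8]]  # two records: exercises a dynamic batch axis

payload = {"X": [{"input": row} for row in batch]}
body = json.dumps(payload)

# This body would be POSTed to http://localhost:9090/v1/models/<model-name>
# (host, port, and path are assumptions based on the README).
print(body)
```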

I also got this error in terminal:

(screenshot of the error; here the batch size axis is static, the second axis is dynamic)

Does this mean that AI-Serving is not supporting dynamic axes, and when will this feature be available?

Thank you.

scorebot commented 4 years ago

@ajertec It seems the error above is thrown by the ONNX Runtime backend. Could you provide more information, for example the model metadata and your input payload?

scorebot commented 4 years ago

@ajertec Could you provide more info here, so that we can reproduce your issue?

ajertec commented 4 years ago

@scorebot I used the PyTorch function torch.onnx.export to convert my PyTorch model into the .onnx format. One of its parameters, dynamic_axes, enables the exported .onnx model to have dynamic axes. I used dynamic_axes={0:"batch"} to make the batch size dynamic.

Locally (with onnxruntime-gpu) I can run inference on the model with various batch sizes and it works. However, when I deploy the model on the AI-Serving server and send request messages, I get this error in the terminal: (Using the ai-serving:0.9.1-cuda Docker image.)

(screenshot of the error in the terminal)

My model's metadata:

(screenshot of the model metadata)

scorebot commented 4 years ago

@ajertec AI-Serving now supports dynamic axes. Please download the latest code and build it, and let me know if it fixes the issue above.

scorebot commented 4 years ago

@ajertec Did the fix work for you? Did you have any problems building it? Please let me know if you run into any issues.

ajertec commented 4 years ago

@scorebot
I pulled the latest code and ran the command to build from source, sbt -Dgpu=true clean assembly. Two unit tests fail:

(screenshot of the failing tests)

It seems there is some issue with the JSON payload. (I am using protobuf anyway.)

If I skip the unit tests with sbt -Dgpu=true 'set test in assembly := {}' clean assembly, start the server, deploy models with dynamic axes, and run inference on them, everything works fine. Thank you.

scorebot commented 4 years ago

@ajertec Thanks for the info. The failed test cases are caused by double-precision comparisons; they are fine and they pass against the CPU backend, so you can ignore them.
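A minimal standard-library illustration (not AI-Serving code) of the general effect: an exact floating-point assertion can pass on a float64 CPU path yet fail against a float32 GPU backend, while a tolerance-based comparison passes on both:

```python
import math
import struct

def to_float32(x: float) -> float:
    # Round-trip through IEEE-754 single precision, as a float32 backend would.
    return struct.unpack("f", struct.pack("f", x))[0]

expected = 0.1 + 0.2            # double-precision reference value
actual = to_float32(expected)   # the same value at float32 precision

assert actual != expected                            # exact equality fails
assert math.isclose(actual, expected, rel_tol=1e-6)  # tolerant check passes
```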

scorebot commented 4 years ago

Fixed in the Docker images autodeployai/ai-serving:0.9.2 and autodeployai/ai-serving:0.9.2-cuda; closing this now.