onnx / tutorials

Tutorials for creating and using ONNX models
Apache License 2.0

ONNXRunTime server hosting of custom ONNX model setup question #211

Open quantum-fusion opened 4 years ago

quantum-fusion commented 4 years ago

Bug Report

Describe the bug

I created a custom ONNX model called ONNX.model. (https://www.dropbox.com/s/23tkd88bzeqk3uy/model.tar?dl=0)

I wish to run it on the onnxruntime server and then show it a picture, kitten.jpg, as in this example from a Jupyter notebook (https://github.com/onnx/tutorials/blob/master/tutorials/OnnxRuntimeServerSSDModel.ipynb)

sudo docker run -it -v $(pwd):$(pwd) -p 9001:8001 mcr.microsoft.com/onnxruntime/server --model_path $(pwd)/model.onnx

curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg

curl -X POST http://127.0.0.1:9001/v1/models/default/versions/1:predict/model -T kitten.jpg

This is the error message that I get back from the REST POST of the kitten picture. {"error_code": 400, "error_message": "Missing or unknown 'Content-Type' header field in the request"}

What am I missing for model hosting with ONNX Runtime Server?

snnn commented 4 years ago

Please follow the tutorial. The server can't accept an image directly.

quantum-fusion commented 4 years ago

@snnn I followed the tutorial, but I need to understand why the REST API does not accept a picture as input. Isn't this the whole point? If it cannot accept a picture, then it doesn't function as expected.

snnn commented 4 years ago

Every image model has its own pre-processing step, so you can't have a generic model server that works with every model and every preprocessing function. Before submitting the image to the model server, you need to process it locally.
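As a rough illustration of "process locally, then submit", here is a minimal sketch of wrapping an already preprocessed float tensor in a JSON request that carries the Content-Type header the error message asks for. The JSON field names (inputs, dims, dataType, rawData) follow the server's PredictRequest format as I understand it; the input name "input" is hypothetical and must match your model, and the URL uses the tutorial's form (ending in :predict) with the 9001 port mapping from the docker command above.

```python
import base64
import json
import struct
import urllib.request

# Assumed endpoint: host port 9001, as mapped in the docker command in this thread.
PREDICT_URL = "http://127.0.0.1:9001/v1/models/default/versions/1:predict"

FLOAT_DATA_TYPE = 1  # numeric value of onnx.TensorProto.FLOAT


def build_predict_request(input_name, dims, values, url=PREDICT_URL):
    """Wrap a preprocessed float tensor (flat list) in a JSON predict request."""
    expected = 1
    for d in dims:
        expected *= d
    if expected != len(values):
        raise ValueError("dims imply %d values, got %d" % (expected, len(values)))

    # rawData holds little-endian float32 bytes, base64-encoded for JSON.
    raw = struct.pack("<%df" % len(values), *values)
    body = {
        "inputs": {
            input_name: {
                "dims": [str(d) for d in dims],
                "dataType": FLOAT_DATA_TYPE,
                "rawData": base64.b64encode(raw).decode("ascii"),
            }
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "input" is a hypothetical tensor name; a real image tensor would be far larger.
    req = build_predict_request("input", [1, 2], [0.5, 1.0])
    print(req.get_method(), req.full_url)
```

Sending the request with urllib.request.urlopen(req) needs a running server; the sketch only builds the request, so the decode, resize, and normalize steps still have to happen before the values reach build_predict_request.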

You can have your own server that accepts a picture POST directly, but you need to write some code. You may use ONNX Runtime's Python/C#/Java/C/C++ API for this, or ONNX's Python backend API, and then put it behind a web backend like Apache/Jetty/Django.
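A minimal sketch of such a wrapper, using only Python's standard library instead of a full web framework. The predict callable is a hypothetical placeholder: in a real deployment it would decode and preprocess the image for your specific model and then run it, for example through an onnxruntime.InferenceSession. The port and URL path are arbitrary choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(predict):
    """Build a handler class that passes the raw POST body to `predict`."""

    class ImageHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            image_bytes = self.rfile.read(length)
            try:
                status, payload = 200, {"prediction": predict(image_bytes)}
            except Exception as exc:  # report model/preprocessing errors to the client
                status, payload = 400, {"error": str(exc)}
            body = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the sketch quiet

    return ImageHandler


def placeholder_predict(image_bytes):
    # Hypothetical stand-in: replace with model-specific decoding,
    # preprocessing, and an inference call.
    return {"bytes_received": len(image_bytes)}


def make_server(predict=placeholder_predict, host="127.0.0.1", port=8080):
    """Create (but do not start) an HTTP server around `predict`."""
    return HTTPServer((host, port), make_handler(predict))


if __name__ == "__main__":
    make_server().serve_forever()
```

With this running, a command of the form curl -X POST http://127.0.0.1:8080/predict -T kitten.jpg reaches predict with the raw JPEG bytes. Everything model-specific lives inside that one function, which is why it cannot be written once for all models.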

quantum-fusion commented 4 years ago

@snnn I am trying to use an ONNX model created using the https://github.com/onnx/tensorflow-onnx API backed by Microsoft.

The problem is that there isn't any method to create the params and sym files that are required by the Multi-model-server backed by AWS (https://github.com/awslabs/multi-model-server/issues/936).

In addition, I cannot present any pictures to your REST API the way I can with the multi-model-server API.

Therefore, I am having trouble using the ONNXRunTime server to host my ONNX model. Do you think it is usable? If so, can you please provide an example?

quantum-fusion commented 4 years ago

Look at how the multi-model-server works. (https://github.com/awslabs/multi-model-server/blob/master/model-archiver/docs/convert_from_onnx.md)

curl -X POST http://127.0.0.1:8080/predictions/squeezenet -T docs/images/kitten_small.jpg

quantum-fusion commented 4 years ago

Do you think that the ONNXRunTimeServer can provide the same interface for a REST POST to the /predict function?

snnn commented 4 years ago

No, ONNX Runtime Server can't. I'm a serious person and I do my job seriously. If the model server can't get exactly the same accuracy as the original model, I think it is wrong. I know how many years a scientist must spend to improve prediction accuracy by 0.1%, and I would not allow myself to ruin their effort. I can't give you something that pretends to work for every model but in fact does not.

ONNX Runtime is a low-level engine. To make real use of it, you need to write your application-specific code, for example, how to decode/resize an image. Such code is not generic; it is model-specific. For example, take a look at https://github.com/tensorflow/models/tree/master/research/slim/preprocessing . These files are just the tip of the iceberg. You can't copy all such code into one application to support every image model in the world.

And if you take a look at https://github.com/onnx/models , every model's README file has a "preprocessing" section, which is the text description of what you should do before feeding your raw data into the model. It is possible to create a demo application that works for a few models, but it is not possible to support all ONNX models in one application.
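To give a sense of what such a "preprocessing" section typically asks for, here is a minimal sketch of one common recipe: scale pixels to [0, 1], normalize per channel, and reorder to channel-first layout. The mean and standard deviation below are the values widely used for ImageNet-trained classifiers and are an assumption here; the README of the specific model is the authority, and decoding or resizing the image (which would need an image library) is left out.

```python
# Commonly used ImageNet statistics; an assumption, not universal.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_to_chw(pixels, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Convert rows of (R, G, B) tuples with 0-255 values to a flat CHW float list."""
    flat = []
    for channel in range(3):  # channel-first: all R values, then G, then B
        for row in pixels:
            for px in row:
                flat.append((px[channel] / 255.0 - mean[channel]) / std[channel])
    return flat


if __name__ == "__main__":
    # A 1x2 "image": one red pixel and one green pixel.
    tiny = [[(255, 0, 0), (0, 255, 0)]]
    print(normalize_to_chw(tiny))
```

A model trained with different statistics, a different channel order, or a different input size would need a different function, which is the point being made above.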

quantum-fusion commented 4 years ago

Thank you for your reply. I have been benchmarking BentoML (https://github.com/bentoml/BentoML) with ResNet50 against the Multi-model-server with SqueezeNet. The problem I was hoping to solve with ONNXRunTime is that BentoML requires me to pre-process the input .jpg files to 224x224, which reduces the accuracy by more than I can afford. While the Multi-model-server with SqueezeNet is attractive and easy to use through its REST API, its accuracy is still lower than that of Azure Computer Vision (azure.microsoft.com/en-us/services/cognitive-services/computer-vision).

I will have to take a look at your examples and see whether the pre-processing will allow the REST API to work with a curl POST of a picture. At this point, the only service that does that is the multi-model-server, but it does not accept custom ONNX models. This is a huge disappointment.

GCP Computer Vision and Azure Computer Vision both accept .jpeg input files but do not allow custom ONNX models, because their models are pre-trained neural nets. To date, only the Microsoft Azure Computer Vision model gives me an acceptable level of accuracy.

Thanks for your input and suggestions; they are appreciated.