autodeployai / ai-serving

Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints
Apache License 2.0

AI Serving ONNX model install and ONNX Runtime library missing #1

Closed: quantum-fusion closed this issue 4 years ago

quantum-fusion commented 4 years ago

Hello, I got the AI-Serving server up and running; however, when I install the ONNX ResNet50 model, the curl command fails because it cannot find the ONNX Runtime native library dependencies.

Do you have any idea where these dependencies are, and how to install them? I didn't see that in the build instructions.

curl -X PUT --data-binary @./resnet50v2/resnet50-v2-7.onnx -H "Content-Type: application/octet-stream" http://localhost:9090/v1/validate

{"error":"Onnx Runtime initialization failed: no onnxruntime in java.library.path: [/Users/h/torch/install/lib, ., /Users/h/Library/Java/Extensions, /Library/Ja((base) MacBoo(base) MacBook-Pro:~/ai-servi(base) MacBook-Pro:~/ai-se(base) MacBook-Pro:~(ba(base(bas(((ba(base) Mac(base) Ma((base(((b(ba((b(base) M(base) Mac(ba(ba((ba(b(ba(bas((base) MacBook-Pro:~/ai-serving-master quantum-fusion$

scorebot commented 4 years ago

@quantum-fusion Please refer to https://github.com/autodeployai/ai-serving#onnx-runtime-configuration, which describes how to configure the ONNX Runtime in AI-Serving.

We're developing a new version that picks up the ONNX Runtime from Maven Central, since the ONNX Runtime is now distributed in the central Maven repository. The new version will be released soon.

Currently, you could either build the ONNX Runtime yourself or use the AI-Serving Docker image: docker pull autodeployai/ai-serving.

quantum-fusion commented 4 years ago

@scorebot It would be great if you could let me know when the new version that picks up the dependency from the Maven repository is available.

I say this because I had already tried the Microsoft build instructions, but the build failed on macOS. See the bug report here: ONNXRuntime fails to build on MAC https://github.com/microsoft/onnxruntime/issues/4735

scorebot commented 4 years ago

@quantum-fusion Please update your ai-serving repository; it can now pick up the ONNX Runtime binaries from Maven Central. Please let me know if you have any problems.

quantum-fusion commented 4 years ago

@scorebot This is the transcript of what happened. There were some WARN messages. Is this what you were expecting?

(venv) (base) MacBook-Pro:~/ai-serving quantum-fusion$ sbt clean assembly
[info] Loading settings for project ai-serving-build from plugins.sbt ...
[info] Loading project definition from /Users/hottelet/ai-serving/project
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] Loading settings for project ai-serving from build.sbt ...
[info] Set current project to ai-serving (in build file:/Users/hottelet/ai-serving/)
[success] Total time: 5 s, completed Aug 11, 2020, 10:58:02 AM
[info] Updating
  https://repo1.maven.org/maven2/com/microsoft/onnxruntime/onnxruntime/1.4.0/onnxruntime-1.4.0.pom 100.0% [##########] 1.6 KiB (1.3 KiB / s)
  https://repo1.maven.org/maven2/org/pmml4s/pmml4s_2.13/0.9.7/pmml4s_2.13-0.9.7.pom 100.0% [##########] 2.5 KiB (2.0 KiB / s)
[info] Resolved dependencies
[info] Fetching artifacts of
  https://repo1.maven.org/maven2/org/pmml4s/pmml4s_2.13/0.9.7/pmml4s_2.13-0.9.7.jar 100.0% [##########] 1.9 MiB (234.0 KiB / s)
  https://repo1.maven.org/maven2/com/microsoft/onnxruntime/onnxruntime/1.4.0/onnxruntime-1.4.0.jar 100.0% [##########] 81.0 MiB (3.9 MiB / s)
[info] Fetched artifacts of
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] Compiling 2 protobuf files to /Users/hottelet/ai-serving/target/scala-2.13/src_managed/main
[info] Compiling schema /Users/hottelet/ai-serving/src/main/protobuf/ai-serving.proto
[info] Compiling schema /Users/hottelet/ai-serving/src/main/protobuf/onnx-ml.proto
protoc-jar: protoc version: 3.11.4, detected platform: osx-x86_64 (mac os x/x86_64)
protoc-jar: java.io.IOException: java.io.FileNotFoundException: /Users/hottelet/.m2/settings.xml (No such file or directory)
protoc-jar: cached: /var/folders/60/l9x5s_456sx9nspbd1l90_fm0000gn/T/protocjar.webcache/com/google/protobuf/protoc/maven-metadata.xml
protoc-jar: cached: /var/folders/60/l9x5s_456sx9nspbd1l90_fm0000gn/T/protocjar.webcache/com/google/protobuf/protoc/3.11.4/protoc-3.11.4-osx-x86_64.exe
protoc-jar: executing: [/var/folders/60/l9x5s_456sx9nspbd1l90_fm0000gn/T/protocjar2309765043864006959/bin/protoc.exe, --plugin=protoc-gen-scala_0=/var/folders/60/l9x5s_456sx9nspbd1l90_fm0000gn/T/protocbridge4367480704088731231, --scala_0_out=flat_package,grpc:/Users/hottelet/ai-serving/target/scala-2.13/src_managed/main, -I/Users/hottelet/ai-serving/src/main/protobuf, -I/Users/hottelet/ai-serving/target/protobuf_external, -I/Users/hottelet/ai-serving/src/main/protobuf, -I/Users/hottelet/ai-serving/target/protobuf_external, /Users/hottelet/ai-serving/src/main/protobuf/ai-serving.proto, /Users/hottelet/ai-serving/src/main/protobuf/onnx-ml.proto]
[info] Compiling protobuf
[info] Protoc target directory: /Users/hottelet/ai-serving/target/scala-2.13/src_managed/main
[info] Compiling 54 Scala sources to /Users/hottelet/ai-serving/target/scala-2.13/classes ...
[info] Compiling 9 Scala sources to /Users/hottelet/ai-serving/target/scala-2.13/test-classes ...
10:58:44.392 INFO ai-autodeploy-serving-ServerSpec-akka.actor.default-dispatcher-5 akka.event.slf4j.Slf4jLogger Slf4jLogger started
10:58:44.506 INFO AI-Serving-akka.actor.default-dispatcher-5 akka.event.slf4j.Slf4jLogger Slf4jLogger started
10:58:44.530 INFO pool-19-thread-11 ai.autodeploy.serving.AIServer$ Predicting thread pool size: 16
10:58:44.933 INFO pool-19-thread-11-ScalaTest-running-ServerSpec a.a.serving.deploy.ModelManager$ Service home located: /var/folders/60/l9x5s_456sx9nspbd1l90_fm0000gn/T/ai-serving-test-7957435196051561081
10:58:44.992 WARN ai-autodeploy-serving-ServerSpec-akka.actor.default-dispatcher-6 a.a.serving.errors.ErrorHandler$ Model 'not-exist-model' not found
10:58:45.044 WARN pool-19-thread-11-ScalaTest-running-ServerSpec a.a.serving.errors.ErrorHandler$ Prediction request takes unknown content type: none/none
10:58:45.057 WARN ai-autodeploy-serving-ServerSpec-akka.actor.default-dispatcher-6 a.a.serving.errors.ErrorHandler$ Model 'not-exist-model' not found
10:58:45.063 WARN ai-autodeploy-serving-ServerSpec-akka.actor.default-dispatcher-6 a.a.serving.errors.ErrorHandler$ Model 'not-exist-model' not found
[info] ServerSpec:
[info] The HTTP service
[info] - should return an OK status response for GET requests to /up
[info] - should return a 404 error with a model that does not exist for GET requests to /v1/models/not-exist-model
[info] - should return a 400 error for POST requests to /v1/models/not-exist-model without header field Content-Type
[info] - should return a 404 error with a model that does not exist for POST requests to /v1/models/not-exist-model
[info] - should return a 404 error with a model that does not exist for DELETE requests to /v1/models/not-exist-model
[info] UtilsSpec:
[info] Utils
[info] - can shapeOfValue
[info] DataUtils
[info] - can anyToFloat
10:58:45.210 INFO ai-autodeploy-serving-OnnxHttpSpec-akka.actor.default-dispatcher-5 akka.event.slf4j.Slf4jLogger Slf4jLogger started
[info] OnnxHttpSpec:
[info] The HTTP service
[info] - should return a validation response for POST requests to /v1/validate
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION} using json payload in records: list like [{column -> value}, … , {column -> value}]
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION} using json payload in split: dict like {‘columns’ -> [columns], ‘data’ -> [values]}
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION} using binary protobuf payload in records: list like [{column -> value}, … , {column -> value}]
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION} using binary protobuf payload in split: dict like {‘columns’ -> [columns], ‘data’ -> [values]}
10:58:46.238 INFO ai-autodeploy-serving-PmmlHttpSpec-akka.actor.default-dispatcher-6 akka.event.slf4j.Slf4jLogger Slf4jLogger started
[info] PmmlHttpSpec:
[info] The HTTP service
[info] - should return a validation response for POST requests to /v1/validate
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION} using payload in records: list like [{column -> value}, … , {column -> value}]
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME} using payload in split: dict like {‘columns’ -> [columns], ‘data’ -> [values]}
[info] - should return a prediction response for POST requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION} using payload in records: list like [{column -> value}, … , {column -> value}] with output filter
[info] - should return a model metadata response with specified version for GET requests to /v1/models/${MODEL_NAME}/versions/${MODEL_VERSION}
[info] - should return a model metadata response with all versions for GET requests to /v1/models/${MODEL_NAME}
[info] - should return all models metadata response for GET requests to /v1/models
[info] PmmlGrpcSpec:
[info] The GRPC service
[info] - should return a validation response for calling 'validate'
[info] - should return a prediction response for calling 'predict' using payload in records: list like [{column -> value}, … , {column -> value}]
[info] - should return a prediction response for calling 'predict' using payload in split: dict like {‘columns’ -> [columns], ‘data’ -> [values]}
[info] - should return a model metadata response for calling 'loadModelMetadataWithVersion' with the specified model and version
[info] - should return a model metadata response for calling 'loadModelMetadataWithVersion' with the specified model
[info] - should return all models metadata response for calling 'loadModelMetadataWithVersion' without a model
[info] OnnxGrpcSpec:
[info] The GRPC service
[info] - should return a validation response for calling 'validate'
[info] - should return a prediction response for calling 'predict' using payload in records: list like [{column -> value}, … , {column -> value}]
[info] - should return a prediction response for calling 'predict' using payload in split: dict like {‘columns’ -> [columns], ‘data’ -> [values]}
[info] Run completed in 3 seconds, 693 milliseconds.
[info] Total number of tests run: 28
[info] Suites: completed 6, aborted 0
[info] Tests: succeeded 28, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Strategy 'concat' was applied to a file (Run the task at debug level to see details)
[info] Strategy 'discard' was applied to 57 files (Run the task at debug level to see details)
[info] Strategy 'last' was applied to 13 files (Run the task at debug level to see details)
[info] Strategy 'rename' was applied to 4 files (Run the task at debug level to see details)
[success] Total time: 68 s (01:08), completed Aug 11, 2020, 10:59:10 AM

quantum-fusion commented 4 years ago

@scorebot This is the generalized command, but I don't know where the onnxruntime4j_jni library is located.

@scorebot java -Donnxruntime.native.onnxruntime4j_jni.path=/path/to/onnxruntime4j_jni -Donnxruntime.native.onnxruntime.path=/path/to/onnxruntime -jar ai-serving-assembly-.jar

scorebot commented 4 years ago

@quantum-fusion The warnings above are expected; they are just logs from unit tests that exercise some error cases.

You do NOT need to set the properties onnxruntime.native.onnxruntime4j_jni.path and onnxruntime.native.onnxruntime.path; they are only used when you build the ONNX Runtime yourself and want to replace the default binaries.

See the topic for details: https://github.com/autodeployai/ai-serving#onnx

quantum-fusion commented 4 years ago

@scorebot I executed the AI-Serving server and these are the results from the console.

java -jar ./target/scala-2.13/ai-serving-assembly-0.9.0.jar
11:16:56.778 INFO AI-Serving-akka.actor.default-dispatcher-5 akka.event.slf4j.Slf4jLogger Slf4jLogger started
11:16:56.828 INFO main ai.autodeploy.serving.AIServer$ Predicting thread pool size: 16
11:17:02.102 INFO main a.autodeploy.serving.protobuf.GrpcServer AI-Serving grpc server started, listening on 9091
11:17:02.600 INFO main ai.autodeploy.serving.AIServer$ AI-Serving http server started, listening on http://0.0.0.0:9090/
11:23:40.340 INFO AI-Serving-akka.ai-predicting-dispatcher-18 a.a.serving.deploy.ModelManager$ Service home located: /opt/ai-serving
11:23:43.996 ERROR AI-Serving-akka.ai-predicting-dispatcher-18 a.a.serving.errors.ErrorHandler$ /opt/ai-serving
11:24:03.037 ERROR AI-Serving-akka.ai-predicting-dispatcher-24 a.a.serving.errors.ErrorHandler$ /opt/ai-serving
11:24:39.389 ERROR AI-Serving-akka.ai-predicting-dispatcher-30 a.a.serving.errors.ErrorHandler$ /opt/ai-serving
11:26:55.508 WARN AI-Serving-akka.ai-predicting-dispatcher-34 a.a.serving.errors.ErrorHandler$ Model 'mnist' not found
11:26:55.519 WARN AI-Serving-akka.ai-predicting-dispatcher-35 a.a.serving.errors.ErrorHandler$ Model 'mnist' not found
11:27:13.224 WARN AI-Serving-akka.actor.default-dispatcher-33 a.a.serving.errors.ErrorHandler$ Model 'mnist' not found
11:27:13.236 WARN AI-Serving-akka.ai-predicting-dispatcher-37 a.a.serving.errors.ErrorHandler$ Model 'mnist' not found

@scorebot The test case that I executed was the AIServingMnistOnnxModel.ipynb example. It appears that the onnx model did not load properly after downloading the .tar file.

/ai-serving/examples quantum-fusion$ ls -lrt
total 368
drwxr-xr-x 6 hottelet staff   192 Aug  3  2018 mnist
-rw-r--r-- 1 hottelet staff 25962 Oct 30  2018 mnist.tar.gz
-rw-r--r-- 1 hottelet staff 14937 Aug 11 10:55 AIServingIrisXGBoostPMMLModel.ipynb
-rw-r--r-- 1 hottelet staff 20435 Aug 11 10:55 AIServingMnistOnnxModel.ipynb
-rw-r--r-- 1 hottelet staff  4205 Aug 11 10:55 IrisXGBoost.ipynb
-rw-r--r-- 1 hottelet staff 50454 Aug 11 10:55 ai_serving_pb2.py
-rw-r--r-- 1 hottelet staff 60914 Aug 11 10:55 onnx_ml_pb2.py
drwxr-xr-x 4 hottelet staff   128 Aug 11 11:21 __pycache__
(base) MacBook-Pro:~/ai-serving/examples quantum-fusion$ ls -lrt mnist
total 56
drwxr-xr-x 4 hottelet staff   128 Jun 20  2018 test_data_set_0
drwxr-xr-x 4 hottelet staff   128 Aug  3  2018 test_data_set_1
drwxr-xr-x 4 hottelet staff   128 Aug  3  2018 test_data_set_2
-rw-r--r-- 1 hottelet staff 26454 Oct 30  2018 model.onnx

Screen Shot 2020-08-11 at 11 30 27 AM
quantum-fusion commented 4 years ago

@scorebot Based on the example provided, it isn't clear how to serve the mnist model.onnx from the command line when launching ai-serving. Is there a better example? I need the general case, because my objective is to be able to serve any ONNX model from the ONNX model zoo.

scorebot commented 4 years ago

@quantum-fusion From the logs above, it seems the default service home /opt/ai-serving does not exist, or it is not a writable directory. Please make sure it exists and is writable. You can specify a different location instead of the default, for example -Dservice.home="/path/to/writable-directory" when starting the server. For more details, see here: https://github.com/autodeployai/ai-serving#server-configurations

quantum-fusion commented 4 years ago

@scorebot I ran the Jupyter notebook and started the AI-Serving server using the docker command shown in the notebook: docker run --rm -it -v $(PWD):/opt/ai-serving -p 9090:9090 -p 9091:9091 autodeployai/ai-serving

@scorebot The question I have is: is it possible to specify which ONNX model you are loading into the server for model serving? The Jupyter notebook loads the model via Python at run time, not on the command line. The use case I want is to load a custom ONNX model and then run a curl command with a JPEG picture, getting the classification back as a JSON response, for object image classification.

quantum-fusion commented 4 years ago

@scorebot See this example of how the ONNX Runtime server accepts an ONNX model on the command line (https://github.com/onnx/tutorials/blob/master/tutorials/OnnxRuntimeServerSSDModel.ipynb): sudo docker run -it -v $(pwd):$(pwd) -p 9001:8001 mcr.microsoft.com/onnxruntime/server --model_path $(pwd)/ssd.onnx

quantum-fusion commented 4 years ago

@scorebot The question I have is: does your service have a --model_path option? If not, what is the syntax? docker run --rm -it -v $(PWD):/opt/ai-serving -p 9090:9090 -p 9091:9091 autodeployai/ai-serving --model_path $(pwd)/mymodel.onnx

quantum-fusion commented 4 years ago

@scorebot I want a REST service, so I can ask the model a question like in the MxNet server example (https://github.com/onnx/tutorials/blob/master/tutorials/ONNXMXNetServer.ipynb) curl -X POST http://127.0.0.1:8080/squeezenet/predict -F "input_0=@kitten.jpg"

scorebot commented 4 years ago

@quantum-fusion OK, you are using Docker to run AI-Serving. Let me explain how to score an ONNX model using Docker:

  1. First, run the docker in a shell window:

    docker run --rm -it -v $(pwd):/opt/ai-serving -p 9090:9090 -p 9091:9091 autodeployai/ai-serving

    Here, you don't need to specify a model path as you do with the ONNX Runtime server; the model will be deployed in the next step. The directory $(pwd) is mounted as the service home and must be writable.

  2. Deploy an ONNX model into AI-Serving. Here, I used the model in the test dir of AI-Serving: ai-serving/src/test/resources

    curl -X PUT --data-binary @mnist.onnx -H "Content-Type: application/octet-stream"  http://localhost:9090/v1/models/mnist

    Here, mnist.onnx is the ONNX model you want to deploy, and the name mnist in the URL is the model identifier used in the prediction requests that follow. The model is uploaded to the server and then deployed, ready for scoring.

  3. Predict a deployed ONNX model:

    curl -X POST -d @mnist_request_0.json -H "Content-Type: application/json" http://localhost:9090/v1/models/mnist

    The API will return scoring results in the JSON format.

For steps 2 and 3, you can use any REST client, for example Python, as described in the example notebook: https://github.com/autodeployai/ai-serving/blob/master/examples/AIServingMnistOnnxModel.ipynb
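
If you prefer Python over curl, a minimal sketch of steps 2 and 3 using the requests library could look roughly like this (assuming the server from step 1 is reachable at localhost:9090, and that mnist.onnx and a request payload such as mnist_request_0.json are in the current directory):

import requests

base = "http://localhost:9090/v1"

# Step 2: deploy the ONNX model under the name "mnist".
with open("mnist.onnx", "rb") as f:
    resp = requests.put(f"{base}/models/mnist",
                        data=f.read(),
                        headers={"Content-Type": "application/octet-stream"})
print(resp.status_code, resp.text)

# Step 3: score the deployed model with a prepared JSON request payload.
with open("mnist_request_0.json", "rb") as f:
    resp = requests.post(f"{base}/models/mnist",
                         data=f.read(),
                         headers={"Content-Type": "application/json"})
print(resp.json())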

quantum-fusion commented 4 years ago

@scorebot I want a REST service, so I can ask the model a question like in the MxNet server example (https://github.com/onnx/tutorials/blob/master/tutorials/ONNXMXNetServer.ipynb) curl -X POST http://127.0.0.1:8080/squeezenet/predict -F "input_0=@kitten.jpg"

@scorebot Can this syntax be modified like so?: curl -X POST -H "Content-Type: jpeg" http://localhost:9090/v1/models/mnist/predict -F "input_0=@kitten.jpg"

@scorebot What I like about your AI-Serving is that it accepts ONNX models, and could accept any ONNX model. What isn't clear to me is what your REST API is, so if I want to use it like above, is there a /predict endpoint that accepts pictures? I might want to use the SqueezeNet model from the model zoo (https://github.com/onnx/models/tree/master/vision/classification/squeezenet/model).

@scorebot See the example of the multi model server (https://github.com/awslabs/multi-model-server) curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg curl -X POST http://127.0.0.1:8080/predictions/squeezenet -T kitten.jpg

quantum-fusion commented 4 years ago

@scorebot To clarify, I believe what I am asking for is the REST API spec for your AI-Serving server. For example, TorchServe has a REST API specified here (https://github.com/pytorch/serve/blob/master/docs/inference_api.md#predictions-api), and there is a Predictions API. I am able to input the following arguments using the REST API /predictions:

@scorebot Question: does your AI-Serving server have an inference API such as predictions, and will your REST API allow me to post a picture to the prediction service so it can predict what the picture actually is, returning the image classification as a JSON response message?

quantum-fusion commented 4 years ago

@scorebot A syntax along these lines: curl -X POST -H "Content-Type: jpeg" http://localhost:9090/v1/models/squeezenet/predict -F "input_0=@kitten.jpg"

@scorebot Is this permitted, and if so, what is the REST API spec? Do you have an example?

scorebot commented 4 years ago

The REST API Spec of AI-Serving is here: https://github.com/autodeployai/ai-serving#rest-apis

An example of scoring ONNX models: https://github.com/autodeployai/ai-serving#scoring-onnx-models

AI-Serving can NOT support the following API that accepts an input image:

curl -X POST -H "Content-Type: jpeg" http://localhost:9090/v1/models/squeezenet/predict -F "input_0=@kitten.jpg"

This is because the ONNX model does not include the image preprocessing. Take the squeezenet model as an example; its preprocessing operations are described here: https://github.com/onnx/models/tree/master/vision/classification/squeezenet#preprocessing The ONNX model knows nothing about them, so you need to do the preprocessing yourself before calling the AI-Serving API.
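
As a rough illustration, that preprocessing could be written in Python like the following sketch (resize, center-crop to 224x224, normalize with the ImageNet mean/std, reorder to NCHW); please double-check the exact values against the SqueezeNet page above:

import numpy as np
from PIL import Image

def preprocess(image_path):
    # Resize to 256x256 (a simplification of the documented resize of the
    # shorter side), then center-crop to 224x224.
    img = Image.open(image_path).convert("RGB")
    img = img.resize((256, 256))
    left = (256 - 224) // 2
    img = img.crop((left, left, left + 224, left + 224))

    # Scale to [0, 1] and normalize with the ImageNet mean/std.
    data = np.asarray(img, dtype=np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    data = (data - mean) / std

    # HWC -> CHW, then add a batch dimension: (1, 3, 224, 224).
    data = data.transpose(2, 0, 1)
    return np.expand_dims(data, axis=0)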

I wrote an example notebook for the squeezenet model. Please download it and rename it (remove the trailing .txt; .ipynb files are not allowed as issue attachments), then move it into the directory ai-serving/examples, and download the model squeezenet1.1-7.onnx and the image kitten.jpg from the following links:

  1. https://github.com/onnx/models/blob/master/vision/classification/squeezenet/model/squeezenet1.1-7.onnx
  2. https://s3.amazonaws.com/model-server/inputs/kitten.jpg

Then run the notebook; you will see that AI-Serving supports both payload formats: JSON and binary.

AIServingSqueezeNetOnnxModel.ipynb.txt

quantum-fusion commented 4 years ago

@scorebot I executed your Jupyter notebook for SqueezeNet as follows:

  1. docker run --rm -it -v $(PWD):/opt/ai-serving -p 9090:9090 -p 9091:9091 autodeployai/ai-serving

The results show a predicted class index of 285, but do not say that there is a kitten in the JSON response:

Screen Shot 2020-08-13 at 6 57 18 AM
scorebot commented 4 years ago

The ONNX squeezenet model does not contain any info about class labels. Please refer to the notebook: https://github.com/onnx/models/blob/master/vision/classification/imagenet_inference.ipynb

The following file contains all class labels: https://s3.amazonaws.com/onnx-model-zoo/synset.txt

Download it, then run the following code:

with open('synset.txt', 'r') as f:
    labels = [l.rstrip() for l in f]

The value of labels[285] is 'n02124075 Egyptian cat', which should be the result you want

quantum-fusion commented 4 years ago

@scorebot The main problem that I thought AI-Serving was solving was that it would not require synset.txt files as inputs; however, this is not the case. If I convert any ONNX model from the ONNX model zoo to MAR format, it requires an input synset.txt file with the class labels. The main problem I am having is that the model zoo does not publish these files. This is a major problem for me, and I don't know how to solve it.

quantum-fusion commented 4 years ago

@scorebot I want to be able to host any custom ONNX model that I create, but I do not know where these synset.txt files with the class labels come from. Is there a simple example that explains where they come from and how to create them? I can't find the specification anywhere.

quantum-fusion commented 4 years ago

@scorebot I posted my writeup and question to the PyTorch community here: https://discuss.pytorch.org/t/imagenet-classes/4923/4

  1. PTH to MAR format using TorchServe

Pre-requisites to create a torch model archive (.mar) :

serialized-file (.pt) : This file represents the state_dict in case of eager mode model.

model-file (.py) : This file contains the model class, extended from torch.nn.Module, representing the model architecture. This parameter is mandatory for eager mode models. The file must contain only one class definition extended from torch.nn.Module.

index_to_name.json : This file contains the mapping of predicted index to class. The default TorchServe handlers return the predicted index and probability. This file can be passed to the model archiver using the --extra-files parameter (a rough sketch of generating such a file from a label list appears after this list).

version : Model’s version.

handler : TorchServe default handler’s name or path to custom inference handler(.py)

PTH to MAR format (TorchServe) https://github.com/pytorch/serve/blob/master/examples/README.md

  2. ONNX to MAR format using Multi-Model-Server

The downloaded model artifact files are:

Model Definition (json file) - contains the layers and overall structure of the neural network.

Model Params and Weights (params file) - contains the parameters and the weights.

Model Signature (json file) - defines the inputs and outputs that MMS is expecting to hand-off to the API.

assets (text files) - auxiliary files that support model inference such as vocabularies, labels, etc. These vary depending on the model.

ONNX to MAR format (Multi-Model-Server) https://github.com/awslabs/multi-model-server/blob/master/model-archiver/docs/convert_from_onnx.md
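
For example, if the class labels were available as a synset.txt file (one label per line, in class-index order, like the ImageNet file above), I imagine an index_to_name.json could be generated roughly like this sketch; the exact schema each server expects would still need to be checked against its own examples:

import json

# Read the labels, one per line, in class-index order.
with open('synset.txt', 'r') as f:
    labels = [l.rstrip() for l in f]

# Write a simple {"0": "label0", "1": "label1", ...} mapping.
# NOTE: this layout is an assumption; verify it against the examples
# shipped with the target server (TorchServe, multi-model-server, etc.).
with open('index_to_name.json', 'w') as f:
    json.dump({str(i): label for i, label in enumerate(labels)}, f, indent=2)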

quantum-fusion commented 4 years ago

@scorebot The main problem is that I don't know where the model artifact files come from if I only have a PT or ONNX model from the model zoo.

quantum-fusion commented 4 years ago

@scorebot The ONNX model zoo is here (https://github.com/onnx/models), but it does not include or specify the class label files. This is the problem: I don't know how to host models with your AI-Serving server, or with TorchServe or multi-model-server, because the required artifact files are not published. In addition, I find no specification or detail on how to create the files. That is the disconnect between the model servers and the published models. Therefore, I don't see how model hosting is viable. Any thoughts?

scorebot commented 4 years ago

Yes, none of the image classification models from the ONNX model zoo provide labels. The output of these ONNX models is just an array of numeric values; the postprocessing then calculates the softmax probability scores for each class and returns the indices of the most probable classes. I think the model publisher should provide the class labels with the model, so the postprocessing could use them and output labels instead of indices.
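
As a rough sketch of that postprocessing in Python, assuming outputs is the raw score array returned for one image and labels is the list read from synset.txt:

import numpy as np

def postprocess(outputs, labels, top_k=5):
    # Softmax over the raw scores to get class probabilities.
    scores = np.asarray(outputs, dtype=np.float32).flatten()
    exp = np.exp(scores - np.max(scores))
    probs = exp / exp.sum()

    # Indices of the most probable classes, mapped to their labels.
    top = np.argsort(probs)[::-1][:top_k]
    return [(labels[i], float(probs[i])) for i in top]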

quantum-fusion commented 4 years ago

@scorebot Do you know how these files are created? The model servers do not specify the format of the files:

scorebot commented 4 years ago

Sorry, I don't know how to create them. I think the model creator should know them.

quantum-fusion commented 4 years ago

@scorebot I found some explanation of an attribute called class_to_idx, which maps class names to indices; it implies using folder names of the pictures as class labels and indexes (https://discuss.pytorch.org/t/custom-label-for-torchvision-imagefolder-class/52300); see the code here (https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/datasets/folder.py#L93-L94). I found that ImageNet has quite extensive class names and index files, with pictures organized by subfolder and index numbers (https://github.com/thuml/HashNet/tree/master/pytorch#datasets). The files are quite large. I have been searching for a simpler example.
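
For reference, the linked torchvision code derives class_to_idx from the dataset folder names, roughly like this sketch:

import os

def find_classes(directory):
    # Each subfolder name becomes a class label; sorting fixes the index order.
    classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
    class_to_idx = {cls_name: i for i, cls_name in enumerate(classes)}
    return classes, class_to_idx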

quantum-fusion commented 4 years ago

@scorebot There was a pretty good research paper that describes it, though not in great detail (https://www.researchgate.net/publication/316021174_Fusion_of_local_and_global_features_for_effective_image_extraction).

quantum-fusion commented 4 years ago

@scorebot It's the ImageNet people from Stanford who know how to create the synset files; see the method here (https://pytorch.org/docs/1.1.0/_modules/torchvision/datasets/imagenet.html) and http://image-net.org. They appear to have 14,197,122 images and 21,841 synsets indexed.

quantum-fusion commented 4 years ago

@scorebot There is some explanation about the synset format (http://image-net.org/download-API), but I still don't know how to create them.

scorebot commented 4 years ago

@quantum-fusion Got them, thanks. Anyway, please let me know if you have any suggestions about AI-Serving.