autodeployai / ai-serving

Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints
Apache License 2.0

ML model.onnx file not picked up by latest 0.9.2-cuda / 0.9.3-cuda from Docker #6

Closed: htsoi-boto closed this issue 3 years ago

htsoi-boto commented 3 years ago

The model is located at models/model.onnx.

docker-compose file:

```yaml
version: '2.3'
services:
  aiserving:
    image: autodeployai/ai-serving:0.9.2-cuda
    container_name: aiserving
    runtime: nvidia
    ports:
```

By default, models are served from /opt/ai-serving, but these are not picked up by Docker on start.
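For reference, a minimal compose sketch of how the host models directory might be mounted into the container; the ./models host path, the volume mapping, and the 9090 port mapping are assumptions rather than values from the original file, and (as the reply below explains) the server also expects a specific layout under that mount:

```yaml
version: '2.3'
services:
  aiserving:
    image: autodeployai/ai-serving:0.9.2-cuda
    container_name: aiserving
    runtime: nvidia
    ports:
      - "9090:9090"                         # assumed REST port, matching the curl calls in this thread
    volumes:
      - ./models:/opt/ai-serving/models     # assumed host path for the model store
```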

This deploy mechanism still works:

```
curl -X PUT --data-binary @model.onnx -H "Content-Type: application/octet-stream" http://localhost:9090/v1/models/model
```

Anyone else having issues loading models to the default directory? Does it not pick up files in /opt/ai-serving ending in .onnx?

scorebot commented 3 years ago

@htsoi-boto ai-serving will not load the model model.onnx; the internal storage structure looks like this:

```
/opt/ai-serving/
└── models
    ├── iris
    │   ├── 1
    │   │   ├── model
    │   │   └── version.json
    │   ├── 2
    │   │   ├── model
    │   │   └── version.json
    │   ├── 3
    │   │   ├── model
    │   │   └── version.json
    │   └── model.json
    ├── mnist
    │   ├── 1
    │   │   ├── model
    │   │   └── version.json
    │   └── model.json
    └── super
        ├── 1
        │   ├── model
        │   └── version.json
        └── model.json
```

Under the models subdirectory there are three models: iris, mnist, and super. iris has three versions, while the others have only one; the model file is always located at models/{model_name}/{model_version}/model. I suggest deploying a model through the API above, which creates metadata for the model in model.json and version.json so that model metadata can be retrieved and validated quickly.
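As an illustration of that layout, a sketch of the suggested API-based deployment, assuming that repeating the PUT for the same model name creates a new version directory (the model name iris and the file iris.onnx are only examples):

```bash
# First deployment: assumed to create models/iris/1/model plus version.json
# and model.json under /opt/ai-serving/models/iris/.
curl -X PUT --data-binary @iris.onnx \
     -H "Content-Type: application/octet-stream" \
     http://localhost:9090/v1/models/iris

# Deploying again under the same name is assumed to add models/iris/2/.
curl -X PUT --data-binary @iris.onnx \
     -H "Content-Type: application/octet-stream" \
     http://localhost:9090/v1/models/iris
```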

htsoi-boto commented 3 years ago

I have a specific TinyBERT model (tinybert.onnx) that I have trained; how would this model be characterized?

I can absolutely load it though:

```
curl http://localhost:9090/v1/models/tinybert -H 'Content-Type: application/x-protobuf' -X PUT --data-binary @tinybert/tinybert.onnx
```

and retrieve the model
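For reference, retrieving it here means fetching the model's metadata back from the same path (a sketch; the exact response fields are not shown in this thread):

```bash
# Fetch metadata for the deployed tinybert model; the response format is
# assumed to be JSON describing the model and its versions.
curl http://localhost:9090/v1/models/tinybert
```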

scorebot commented 3 years ago

The model tinybert will be saved in the following layout after it is deployed successfully:

```
/opt/ai-serving/
└── models
    └── tinybert
        ├── 1
        │   ├── model
        │   └── version.json
        └── model.json
```

Then, you can invoke the following API to make a prediction:

```
curl -X POST -d @data.json -H "Content-Type: application/json" http://localhost:9090/v1/models/tinybert
```
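As a small usage example, the same call with the response pretty-printed, assuming the server returns JSON and that data.json already contains a request body matching the model's inputs:

```bash
# Same prediction request as above; python -m json.tool only formats the
# (assumed JSON) response for readability.
curl -s -X POST -d @data.json \
     -H "Content-Type: application/json" \
     http://localhost:9090/v1/models/tinybert | python -m json.tool
```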

htsoi-boto commented 3 years ago

Awesome. This helps a lot. Thanks much!