Open · chenchunhui97 opened this issue 1 month ago
Hi @chenchunhui97,

If your `bert` model is an ONNX model, then you should specify the `onnxruntime` backend in your `config.pbtxt`, but from the logs it looks like `pytorch` is specified.
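For reference, a minimal sketch of what that might look like, assuming the model lives at `models/bert/1/model.onnx`; the tensor names, types, and shapes below are placeholders and must match your actual model:

```
# models/bert/config.pbtxt -- tensor names here are hypothetical
name: "bert"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input_ids"        # placeholder, match your ONNX input name
    data_type: TYPE_INT64
    dims: [ -1 ]
  },
  {
    name: "attention_mask"   # placeholder, match your ONNX input name
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
output [
  {
    name: "logits"           # placeholder, match your ONNX output name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```

On releases as old as 21.09, the equivalent `platform: "onnxruntime_onnx"` form is also accepted in place of the `backend` field.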
Also, I notice your Triton version is about three years old. If you update to the latest release you can take advantage of auto-complete: Triton can infer a minimal `config.pbtxt` from your ONNX model, as described here: https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#auto-generated-model-configuration
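As a sketch of that workflow, assuming a model repository laid out as `/path/to/models/bert/1/model.onnx` (paths and the image tag are examples, not taken from the issue), a recent release can start without a hand-written config; on older releases like 21.09 the same behavior required passing `--strict-model-config=false`:

```
# Image tag and host paths are illustrative; adjust to your setup.
docker run --rm --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/models:/models \
  nvcr.io/nvidia/tritonserver:24.01-py3 \
  tritonserver --model-repository=/models
```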
**Description**
Bug when deploying MacBERT.
**Triton Information**
I use the official image: nvcr.io/nvidia/tritonserver:21.09-py3
**To Reproduce**
Generate the ONNX model for the server (with torch version 2.1.2).
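A minimal sketch of such an export, assuming a Hugging Face MacBERT checkpoint; the model id, sequence length, and tensor names here are illustrative, not taken from the issue:

```python
# Hypothetical export script; the checkpoint and tensor names are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "hfl/chinese-macbert-base"  # assumed checkpoint; in practice use your fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()
model.config.return_dict = False  # return tuples so tracing sees plain tensors

# Dummy input just to trace the graph.
enc = tokenizer("example text", return_tensors="pt", padding="max_length", max_length=128)

torch.onnx.export(
    model,
    (enc["input_ids"], enc["attention_mask"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "logits": {0: "batch"},
    },
    opset_version=17,
)
```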
**Expected behavior**
The server launches successfully. One more question: how do I change the service ports (from 8000, 8001, and 8002 to other customized ports)?
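On the port question (not addressed in the reply above): `tritonserver` exposes a flag per endpoint, so one way to move them, sketched here with arbitrary port numbers, is:

```
# HTTP, gRPC, and metrics ports respectively; the numbers are arbitrary.
tritonserver --model-repository=/models \
  --http-port=9000 \
  --grpc-port=9001 \
  --metrics-port=9002
```

When running inside Docker, an alternative is to keep the defaults and remap them on the host, e.g. `-p 9000:8000 -p 9001:8001 -p 9002:8002`.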