triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html

Dynamic shape ONNX error #7575

Closed: chenchunhui97 closed this issue 2 months ago

chenchunhui97 commented 2 months ago

Description I have ONNX model files to deploy with tritonserver:24.01-py3, but I encountered an error. The shapes of the ONNX model's input/output are dynamic rather than fixed, so my config.pbtxt may be incorrect. (I used model_analyzer to analyze the performance of this model, but the launch failed.)

Triton Information I deploy directly with the tritonserver image, version 24.01-py3.

Are you using the Triton container or did you build it yourself? I am using the Triton container; I did not build it myself.

To Reproduce

This is the input of the ONNX model:

[screenshot of the ONNX model input, not reproduced]

and the output:

[screenshot of the ONNX model output, not reproduced]
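Since the screenshots are not reproduced, the same information can be read programmatically. A minimal sketch, assuming the ONNX file is named model.onnx; dynamic axes print as named parameters instead of fixed integers:

import onnx

# Print every graph input/output with its shape; a dynamic axis shows up
# as a named dim_param (string) rather than a fixed dim_value (integer).
model = onnx.load("model.onnx")  # path assumed
for t in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_param or d.dim_value for d in t.type.tensor_type.shape.dim]
    print(t.name, dims)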

my own config.pbtxt:

name: "rec_onnx"
platform: "onnxruntime_onnx"
max_batch_size : 8

input [
  {
    name: "x"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 48, 320 ]
  }
]
output [
  {
    name: "softmax_11.tmp_0"
    data_type: TYPE_FP32
    dims: [ 40, 6625 ]
  }
]

Expected behavior

The model can be launched and called successfully. (I need to set a specific shape for model inference when profiling this model.)

chenchunhui97 commented 2 months ago

error log:

Model rec_performance load failed: [StatusCode.INTERNAL] failed to load 'rec_performance', failed to poll from model repository
chenchunhui97 commented 2 months ago

Actually, tritonserver can launch the model without a config.pbtxt, but I want to analyze the model's performance, so I must specify the input and output shapes.
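One way to get a known-good starting point is to let Triton auto-complete the configuration (which is what happens when the ONNX model loads without a config.pbtxt, as observed above) and read back what it generated. A minimal sketch, assuming the server's HTTP endpoint is on localhost:8000 and the model is loaded under the name rec_onnx:

import tritonclient.http as httpclient

# Assumes the server is reachable on localhost:8000 and the model was
# loaded without a config.pbtxt, so Triton auto-completed its config.
client = httpclient.InferenceServerClient(url="localhost:8000")
config = client.get_model_config("rec_onnx")  # returned as a dict
print(config)

The dims reported there use -1 for variable-size axes; those values can be copied into a hand-written config.pbtxt.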

chenchunhui97 commented 2 months ago

name: "rec_onnx"
platform: "onnxruntime_onnx"
max_batch_size : 8

input [
  {
    name: "x"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 48, 320 ]
  }
]
output [
  {
    name: "softmax_11.tmp_0"
    data_type: TYPE_FP32
    dims: [ -1, 6625 ]
  }
]

Seems to work. With dims: [ -1, 6625 ], the output's first dimension is declared variable-size, so the config now matches the model's dynamic output shape.
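For completeness, a minimal client call against the fixed configuration, as a sketch: the server address, batch size, and random input data are assumptions; the tensor names and shapes come from the config above:

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")  # address assumed

# Batch of 1: input "x" is [3, 48, 320] per the config, plus the batch
# dimension implied by max_batch_size.
x = np.random.rand(1, 3, 48, 320).astype(np.float32)
inp = httpclient.InferInput("x", list(x.shape), "FP32")
inp.set_data_from_numpy(x)
out = httpclient.InferRequestedOutput("softmax_11.tmp_0")

result = client.infer("rec_onnx", inputs=[inp], outputs=[out])
# The first output dimension is dynamic; e.g. (1, 40, 6625) for this input size.
print(result.as_numpy("softmax_11.tmp_0").shape)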