josephwnv opened this issue 2 years ago
The config.pbtxt seems wrong. I started Triton with --strict-model-config=false
and it loaded the model just fine. You can then query the generated config from Triton's model-config endpoint (`GET /v2/models/<model-name>/config`). Hope this helps!
@pranavsharma The config you gave, which is the one generated by Triton, uses max_batch_size: 0,
as you can see on line 10 of your config.json.
This works if you don't need Triton's dynamic batching feature. However, the first dimension of the input and output tensors is -1,
which means they can support batch size > 0.
So if you change max_batch_size to a value > 0 and run again with the config.json that was updated on the original issue, Triton is not able to reconcile the config with the model, hence the error.
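For reference, the configuration that Triton auto-completes under --strict-model-config=false would look roughly like the sketch below. This is a sketch, not the actual generated file: only the [1, -1, 4] shape of yolonms_layer_1/ExpandDims_1:0 is confirmed by the error log, so the other tensors are omitted.

```
name: "yolov3-10_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 0
output [
  {
    name: "yolonms_layer_1/ExpandDims_1:0"
    data_type: TYPE_FP32
    dims: [ 1, -1, 4 ]
  }
  # with max_batch_size: 0, inputs and the remaining outputs
  # also carry their full shapes, leading dimension included
]
```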
Triton's documentation on max_batch_size and its effect is here.
This example explains how the dims of input and output tensors change when max_batch_size > 0.
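As a sketch with a hypothetical tensor name: for a model whose input is reported as [-1, 3, 416, 416], the config with batching enabled would omit the leading dimension, because with max_batch_size > 0 Triton adds the batch dimension implicitly and it must not appear in dims:

```
max_batch_size: 8
input [
  {
    name: "input"          # hypothetical tensor name
    data_type: TYPE_FP32
    dims: [ 3, 416, 416 ]  # the leading -1 batch dimension is implied, not listed
  }
]
```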
It doesn't look like this model supports dynamic batching. The first dimension of the outputs yolonms_layer_1/ExpandDims_1:0
and yolonms_layer_1/ExpandDims_3:0
is 1, not -1. The error message 0330 19:07:41.647411 1 model_repository_manager.cc:1186] failed to load 'yolov3-10_onnx' version 1: Invalid argument: model 'yolov3-10_onnx', tensor 'yolonms_layer_1/ExpandDims_1:0': for the model to support batching the shape should have at least 1 dimension and the first dimension must be -1; but shape expected by the model is [1,-1,4]
clearly says this. The model needs to be fixed.
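The rule the error message states can be mimicked in a few lines of Python (a simplified sketch of the check, not Triton's actual code):

```python
def supports_batching(model_shape):
    """Triton's rule: for a model to support batching, the shape the model
    reports for a tensor must have at least one dimension and the first
    dimension must be variable (-1)."""
    return len(model_shape) >= 1 and model_shape[0] == -1

# Shape reported for yolonms_layer_1/ExpandDims_1:0 in the error log:
print(supports_batching([1, -1, 4]))   # False: fixed leading 1
print(supports_batching([-1, -1, 4]))  # True: variable leading dimension
```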
Several issues have been reported that cite this error.
The output yolonms_layer_1/ExpandDims_1:0
and the other outputs do support a dynamic batch dimension, as shown by the symbolic (named) dimension variables in the ONNX model.
That's why the error you posted confuses me as well: if dynamic batching is supported, Triton should be able to pick it up.
Description The YOLOv3 ONNX model fails to load.
Triton Information What version of Triton are you using? 2.20 Are you using the Triton container or did you build it yourself? Yes, the Triton container, version 22.03
To Reproduce
Download the model from https://github.com/onnx/models/tree/main/vision/object_detection_segmentation/yolov3/model
config.pbtxt is as follows:

```
name: "yolov3-10_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 128
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, -1, -1 ]
  },
  {
    name: "image_shape"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
output [
  {
    name: "yolonms_layer_1/ExpandDims_1:0"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]
  },
  {
    name: "yolonms_layer_1/ExpandDims_3:0"
    data_type: TYPE_FP32
    dims: [ -1, -1 ]
  },
  {
    name: "yolonms_layer_1/concat_2:0"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
    count: 1
    gpus: 0
  }
]
```
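Note the mismatch this config creates: with max_batch_size: 128, dims: [ -1, 4 ] implies the model's tensor shape is [-1, -1, 4], while the model actually reports [1, -1, 4] for yolonms_layer_1/ExpandDims_1:0. One workaround sketch is to disable batching and spell out the full shape instead (only this output's shape is confirmed by the error log; the other tensors would have to be read from the model):

```
max_batch_size: 0
output [
  {
    name: "yolonms_layer_1/ExpandDims_1:0"
    data_type: TYPE_FP32
    dims: [ 1, -1, 4 ]  # full shape, including the fixed leading 1
  }
  # remaining inputs and outputs likewise need their full shapes
]
```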
Run:

```
sudo docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v/home/josephw/server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:22.03-py3 tritonserver --model-repository=/models
```

And see the following error:

```
0330 19:07:41.647411 1 model_repository_manager.cc:1186] failed to load 'yolov3-10_onnx' version 1: Invalid argument: model 'yolov3-10_onnx', tensor 'yolonms_layer_1/ExpandDims_1:0': for the model to support batching the shape should have at least 1 dimension and the first dimension must be -1; but shape expected by the model is [1,-1,4]
```
If the problem appears to be a bug in the execution of the model itself, first attempt to run the model directly in ONNX Runtime. What is the output from loading and running the model in ORT directly? If there is a problem running the model directly with ORT, please submit an issue in the microsoft/onnxruntime (github.com) project.
If the problem appears to be in Triton itself, provide detailed steps to reproduce the behavior in Triton.
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).
Expected behavior The model is expected to load successfully.