jkjung-avt / tensorrt_demos

TensorRT MODNet, YOLOv4, YOLOv3, SSD, MTCNN, and GoogLeNet
https://jkjung-avt.github.io/
MIT License

Squared actual batch size when exporting yolov4-tiny, from yolo to onnx to trt #545

Open Pochingto opened 2 years ago

Pochingto commented 2 years ago

Hi, thanks for having this awesome repo, it helps a lot when dealing with tensorrt.

I am trying to export a TensorRT model of yolov4-tiny (nc=1) with a batch size of 3. I see there is a MAX_BATCH_SIZE parameter in both yolo_to_onnx.py and onnx_to_tensorrt.py. As in #373, if the two MAX_BATCH_SIZE values differ, an error is thrown.

However, with both MAX_BATCH_SIZE values set to 3, the model exports without error, but the exported model ends up with the batch size squared. The original model's output shape is (8505,), while the exported model's output shape is (76545,). Since 76545 / 8505 = 9, the batch size is actually 9 instead of 3.
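The arithmetic behind that observation can be checked with a few lines of Python. This is only a sketch of the size comparison; `effective_batch` is a hypothetical helper written for illustration and is not part of this repo.

```python
def effective_batch(per_image_size: int, exported_size: int) -> int:
    """Return how many per-image outputs fit in the exported output buffer."""
    if exported_size % per_image_size != 0:
        raise ValueError("exported size is not a whole multiple of the per-image size")
    return exported_size // per_image_size


per_image = 8505   # output size of the original model, as reported above
exported = 76545   # output size of the exported model, as reported above
requested = 3      # MAX_BATCH_SIZE used in both scripts

batch = effective_batch(per_image, exported)
print(batch)                    # 9
print(batch == requested ** 2)  # True
```

The same check explains why a requested batch size of 1 would look unaffected: 1 squared is still 1.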

What setting should I use in order to get a correct batch size of 3?

I am providing the error logs below for when the two MAX_BATCH_SIZE values differ.

Setting ONNX to batch size 3 and TensorRT to batch size 1:

[TensorRT] ERROR: 009_route: all concat input tensors must have the same dimensions except on the concatenation axis (1), but dimensions mismatched at index 0. Input 0 shape: [1,64,72,72], Input 1 shape: [3,64,72,72]

Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)

[TensorRT] ERROR: 004_route: out of bounds slice, input dimensions = [1,64,72,72], start = [0,0,0,0], size = [3,32,72,72], stride = [1,1,1,1].
[TensorRT] ERROR: Layer 004_route failed validation
[TensorRT] ERROR: Network validation failed.
ERROR: failed to build the TensorRT engine!
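The concat error in this log states the rule being violated: all inputs must agree on every dimension except the concatenation axis. Below is a minimal pure-Python sketch of that rule, fed with the shapes from the log. `first_concat_mismatch` is a hypothetical helper for illustration, not TensorRT's actual validation code.

```python
from typing import Optional, Sequence


def first_concat_mismatch(shapes: Sequence[Sequence[int]], axis: int) -> Optional[int]:
    """Return the first dimension index (other than `axis`) where shapes differ, else None."""
    reference = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(reference):
            raise ValueError("all inputs must have the same rank")
        for index, (a, b) in enumerate(zip(reference, shape)):
            # The concatenation axis is allowed to differ; every other axis must match.
            if index != axis and a != b:
                return index
    return None


# Shapes taken from the 009_route error message; concatenation axis is 1 (channels).
print(first_concat_mismatch([[1, 64, 72, 72], [3, 64, 72, 72]], axis=1))  # 0
# With a consistent batch dimension the inputs are compatible.
print(first_concat_mismatch([[3, 64, 72, 72], [3, 64, 72, 72]], axis=1))  # None
```

The mismatch at index 0 is the batch dimension, which is consistent with one input carrying the ONNX batch size and the other carrying the TensorRT batch size.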

Setting ONNX to batch size 1 and TensorRT to batch size 3:

[TensorRT] ERROR: 009_route: all concat input tensors must have the same dimensions except on the concatenation axis (1), but dimensions mismatched at index 0. Input 0 shape: [3,64,72,72], Input 1 shape: [1,64,72,72]

Naming the input tensort as "input".
Building the TensorRT engine.  This would take a while...
(Use "--verbose" or "-v" to enable verbose logging.)

[TensorRT] ERROR: 009_route: all concat input tensors must have the same dimensions except on the concatenation axis (1), but dimensions mismatched at index 0. Input 0 shape: [3,64,72,72], Input 1 shape: [1,64,72,72]
[TensorRT] ERROR: 009_route: all concat input tensors must have the same dimensions except on the concatenation axis (1), but dimensions mismatched at index 0. Input 0 shape: [3,64,72,72], Input 1 shape: [1,64,72,72]
[TensorRT] ERROR: Layer 009_route failed validation
[TensorRT] ERROR: Network validation failed.

ERROR: failed to build the TensorRT engine!

jkjung-avt commented 2 years ago

Sorry. As much as I want to look into this, I am finding it difficult to make time to investigate it...

The main reason I kept MAX_BATCH_SIZE as a separate constant in "yolo_to_onnx.py" is so that I could set it to -1 (letting the batch dimension be dynamic in the ONNX model). Could you try whether "setting ONNX to batch size -1 and TensorRT to batch size 3" works for you?
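To illustrate what a dynamic batch dimension means here, the sketch below treats -1 as a placeholder that is filled in when a concrete batch size is chosen, while a fixed batch dimension must match exactly. It is a pure-Python illustration only; `resolve_shape` is a hypothetical helper, and the 64x72x72 shape is borrowed from the error logs above rather than from the ONNX or TensorRT APIs.

```python
from math import prod
from typing import Sequence, Tuple


def resolve_shape(template: Sequence[int], batch: int) -> Tuple[int, ...]:
    """Fill a leading -1 (dynamic batch) with `batch`; reject a conflicting fixed batch."""
    if batch < 1:
        raise ValueError("batch must be a positive integer")
    if template[0] == -1:
        # Dynamic batch: any concrete batch size is acceptable.
        return (batch, *template[1:])
    if template[0] != batch:
        # Fixed batch: the model and the builder must agree.
        raise ValueError(f"model is fixed to batch {template[0]}, but {batch} was requested")
    return tuple(template)


dynamic = (-1, 64, 72, 72)   # batch left dynamic, as with MAX_BATCH_SIZE = -1
shape = resolve_shape(dynamic, 3)
print(shape)                 # (3, 64, 72, 72)
print(prod(shape) == 3 * prod(dynamic[1:]))  # True: element count scales linearly with batch
```

Under this reading, a correctly resolved shape grows linearly with the batch size, which is the behavior the squared output size departs from.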

Pochingto commented 2 years ago

Thx for looking into this. Unfortunately, the batch size is still squared (3x3=9). I would stick to using a batch size of 4 (2x2) for a while, plz take your time on this issue :-)