jkjung-avt / tensorrt_demos

TensorRT MODNet, YOLOv4, YOLOv3, SSD, MTCNN, and GoogLeNet
https://jkjung-avt.github.io/
MIT License

How to set dynamic batch_size in Tensorrt model? #511

Closed 3a532028 closed 2 years ago

3a532028 commented 2 years ago

I want to set a dynamic batch_size in a TensorRT model so it can be served with Triton. How can I do it?

EyGy commented 2 years ago

AFAIK TensorRT still requires a fixed input batch size: see https://github.com/jkjung-avt/tensorrt_demos/issues/25

jkjung-avt commented 2 years ago

This discussion might be helpful: https://github.com/jkjung-avt/tensorrt_demos/issues/457

You could also check out the dynamic batch size plugin implementation by @philipp-schmidt: https://github.com/jkjung-avt/tensorrt_demos/pull/465

philipp-schmidt commented 2 years ago

Hello everyone,

Dynamic batch size requires a different implementation of the plugin. Only the interface changes, not the actual code / computation. I submitted a merge request that I unfortunately have not been able to finish properly yet. It works well with the standard yolov4, but I was seeing errors for other variants (e.g. yolov4-tiny-3l). This suggests that some of the dimension-calculation functions are not generalized for all variants and compute some dimensions incorrectly.

Unfortunately the ONNX parser does not support all plugin implementations, so we cannot just reuse my plugin implementation and settings from the yolov4-triton-tensorrt repo; the ONNX parser rejects them. Dynamic batching works in that repo because I implement the network layer by layer with the TensorRT API instead of going through ONNX.
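For reference, here is a minimal sketch (not code from this repo) of how a dynamic batch dimension can be declared when building a network directly with the TensorRT Python API. The input name "input" and the 416x416 shape are placeholder assumptions for a YOLOv4-style model, and the block requires the TensorRT runtime to actually run:

```python
import tensorrt as trt  # assumes the TensorRT Python bindings are installed

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# Explicit-batch mode: the batch is part of the tensor shape,
# and -1 marks the dynamic dimension.
flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(flags)

# Placeholder input name/shape for a 416x416 YOLOv4-style model.
network.add_input("input", trt.float32, (-1, 3, 416, 416))
# ... add the rest of the layers here (layer by layer, no ONNX parser) ...

# An optimization profile tells the builder which batch sizes to support.
config = builder.create_builder_config()
profile = builder.create_optimization_profile()
profile.set_shape("input",
                  min=(1, 3, 416, 416),
                  opt=(8, 3, 416, 416),
                  max=(16, 3, 416, 416))
config.add_optimization_profile(profile)
```

At inference time the actual batch size is then set per request (within the min/max range of the profile) before execution.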

You can have a look at the changes I made in my PR and double-check them. @jkjung-avt hasn't had the time to review them yet, which is absolutely fair.

philipp-schmidt commented 2 years ago

@EyGy that's not true; my repo shows otherwise, and I can use a dynamic batch size. It is the ONNX parser that expects a fixed size for plugin layers, or requires the most recent plugin interface.

I don't remember the details, but the main difference is the "mode" in which you load networks into TensorRT. It is a flag on the builder in the code.

philipp-schmidt commented 2 years ago

I think there is a mode where you can have a dynamic batch (!) size but are limited to static input dimensions (TensorRT will just treat the leading dimension of a static input definition as the batch size). Unfortunately the ONNX parser rejects that mode.
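As an aside, if an ONNX export already has a dynamic (-1) batch axis and contains no unsupported plugin layers, a dynamic-batch engine can also be sketched with `trtexec` shape overrides. The file names and the tensor name "input" below are placeholders:

```shell
# Placeholder file/tensor names; the model must have been exported
# with a dynamic (-1) batch axis on "input".
trtexec --onnx=model_dynamic.onnx \
        --minShapes=input:1x3x416x416 \
        --optShapes=input:8x3x416x416 \
        --maxShapes=input:16x3x416x416 \
        --saveEngine=model_dynamic.engine
```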

EyGy commented 2 years ago

Very interesting, so I guess I had it wrong... Thanks for the clarification!