Closed — kgksl closed this 3 years ago
I think dynamic input shape for TensorRT YOLO engines is possible. It would require quite some modification to the code in this repo, though:

1. The `YoloLayerPlugin` class needs to inherit from `IPluginV2DynamicExt` instead of `IPluginV2IOExt`. Refer to the source code here.
2. The plugin could no longer save fixed `yolo_width`, `yolo_height`, `input_width` and `input_height` values. It would need to derive those values from its input tensor shape.
3. Set a proper `net_h` and `net_w` range in the TensorRT optimization profile. Note that those dimensions (H & W) should be set to -1 (dynamic) in the ONNX file.
4. At inference time, the input shape cannot stay unspecified, so you would need to set concrete input shapes manually for the test images.

I don't guarantee this is an exhaustive list, but those are the things off the top of my head.
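The shape arithmetic behind those points (deriving `yolo_width`/`yolo_height` from the input shape, and choosing a profile range) can be sketched in plain Python. The helper names here are made up; in the real plugin this logic would live in C++ inside `IPluginV2DynamicExt::getOutputDimensions()`:

```python
# Plain-Python sketch of the shape logic a dynamic-shape YOLO plugin needs.
# Helper names are hypothetical; the real work happens in C++ inside
# IPluginV2DynamicExt::getOutputDimensions().

YOLO_MAX_STRIDE = 32  # the three YOLO heads use strides 8, 16 and 32

def grid_dims(input_h, input_w, stride):
    """Derive yolo_height/yolo_width from the (now dynamic) input shape
    instead of keeping them as fixed plugin attributes."""
    assert input_h % YOLO_MAX_STRIDE == 0 and input_w % YOLO_MAX_STRIDE == 0
    return input_h // stride, input_w // stride

def profile_shapes(min_hw, opt_hw, max_hw, batch=1, channels=3):
    """min/opt/max NCHW shapes for a TensorRT optimization profile
    with dynamic H and W."""
    return [(batch, channels, h, w) for h, w in (min_hw, opt_hw, max_hw)]
```

With the `tensorrt` Python API, these shapes would then feed `profile.set_shape(input_name, *profile_shapes((320, 320), (608, 608), (1024, 1024)))` on a profile created via `builder.create_optimization_profile()`.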
@jkjung-avt thanks a lot for the quick reply.
Do we need to modify the `yolo_to_onnx.py` code as well? Or is that unnecessary, since the ONNX-to-TensorRT conversion is performed later and defines the output shapes, etc.?
As stated in my previous post, you need to set the input tensor's H and W dimensions to -1 (dynamic) in the ONNX model, i.e. set `height` and `width` to -1 where they are defined in the source code.
@jkjung-avt sorry, I missed it. Thanks a lot!
Do we need to modify the lines below as well? https://github.com/jkjung-avt/tensorrt_demos/blob/9dd56b3b8d841dcfc2e5d1868f4bd785a50cbe98/yolo/yolo_to_onnx.py#L979-L990
Most likely yes. Try setting the H and W dimensions of all output tensors to -1 as well.
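In other words, the static shapes that `yolo_to_onnx.py` currently writes would all gain -1 in the H and W positions. A minimal sketch of the resulting input/output shapes, assuming the usual 3-head YOLO layout with `num_anchors * (num_classes + 5)` output channels per head (the function name is hypothetical):

```python
def dynamic_io_shapes(batch, channels=3, num_heads=3,
                      num_anchors=3, num_classes=80):
    """Sketch: the NCHW shapes yolo_to_onnx.py would declare with
    dynamic H/W.  Input H and W become -1, and since each YOLO head's
    grid is input_size / stride, the output H and W become -1 too."""
    inp = (batch, channels, -1, -1)          # was (batch, 3, height, width)
    out_c = num_anchors * (num_classes + 5)  # e.g. 3 * 85 = 255 for COCO
    outs = [(batch, out_c, -1, -1) for _ in range(num_heads)]
    return inp, outs
```

For yolov4-tiny-style models with two heads, `num_heads` would be 2 instead.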
thank you.
What would we need to change for the plugin to make only the batch size dynamic, but leave all other input dimensions static? @jkjung-avt
This would be pretty cool for deploying to Triton Inference Server, because it can do Dynamic Batch Scheduling, which bundles individual requests into bigger batches for higher throughput.
I have implemented a draft for dynamic batch size plugin which works for yolov4, but still has issues with e.g. yolov4-tiny-3l.
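For batch-only dynamism the profile is simpler: only N is -1 in the ONNX file, while H and W stay fixed, so the plugin's stored `yolo_width`/`yolo_height` remain valid. A sketch of the profile shapes (the batch numbers and function name are arbitrary examples, not from the repo):

```python
def batch_profile_shapes(max_batch, channels=3, height=608, width=608,
                         opt_batch=None):
    """Sketch: min/opt/max NCHW shapes for an optimization profile where
    only the batch dimension is dynamic.  H and W are fixed, so the
    YoloLayerPlugin can keep its static yolo_width/yolo_height."""
    opt = opt_batch or max(1, max_batch // 2)
    return [(1, channels, height, width),          # min
            (opt, channels, height, width),        # opt
            (max_batch, channels, height, width)]  # max
```

Triton's dynamic batcher would then be free to send any batch size between 1 and `max_batch` to the engine.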
I saw that TensorRT supports dynamic input shapes and thought this would allow YOLO models trained with `random=1` to be used with a range of network sizes in a single TensorRT engine. I'm still not 100% sure of the process. However, it means we need an ONNX model that supports dynamic input sizes as well, and I couldn't find out whether we can do this by modifying `yolo_to_onnx.py`. As I can see from the code, the output tensor shape for the ONNX graph needs to be defined, and for YOLO models it is defined based on the network width and height read from the cfg. Can we define a dynamic output shape for ONNX, and is that the only thing we would need to change in this case? Thank you.
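As a rough sanity check on the `random=1` idea: darknet's multi-scale training picks network sizes in multiples of 32 (for example 320 to 608, depending on the darknet version and cfg), and a single optimization profile spanning that range would cover all of them. A sketch (the range is an assumption, not read from any cfg):

```python
def random_resize_candidates(min_size=320, max_size=608, step=32):
    """Sketch: the square network sizes darknet's random=1 training can
    pick (assumed range).  One dynamic-shape engine whose optimization
    profile spans [min_size, max_size] would cover all of them."""
    return list(range(min_size, max_size + 1, step))
```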