YunghuiHsu opened this issue 1 year ago
Following the instructions in https://github.com/NVIDIA-AI-IOT/yolo_deepstream/tree/main/tensorrt_yolov7#prepare-tensorrt-engines, I explicitly specified dynamic batch shapes as suggested, and the engine-build problem was solved. I replaced
```shell
# int8 QAT model, the onnx model with Q&DQ nodes
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx --saveEngine=yolov7QAT.engine --fp16 --int8
```
with
```shell
# int8 QAT model, the onnx model with Q&DQ nodes and dynamic batch
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx \
    --minShapes=images:1x3x640x640 \
    --optShapes=images:12x3x640x640 \
    --maxShapes=images:16x3x640x640 \
    --saveEngine=yolov7QAT.engine --fp16 --int8
```
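For context (my own understanding, not something the repo states): with dynamic shapes, TensorRT tunes its kernels for the `--optShapes` value, so a profile optimized for batch 12 may run poorly at batch 1. If most inference is at batch 1, one option is to keep the larger maximum but move the optimum down to 1; a hypothetical variant of the command above:

```shell
# Dynamic-batch build tuned for batch 1 instead of batch 12
# (yolov7QAT-dyn.engine is an illustrative output name)
/usr/src/tensorrt/bin/trtexec --onnx=yolov7qat.onnx \
    --minShapes=images:1x3x640x640 \
    --optShapes=images:1x3x640x640 \
    --maxShapes=images:16x3x640x640 \
    --saveEngine=yolov7QAT-dyn.engine --fp16 --int8
```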
However, when I benchmark with `/usr/src/tensorrt/bin/trtexec --loadEngine=yourmodel.engine`, the engine that was built with explicit dynamic batch performs much worse:
yolov7QAT.engine (static batch)

```
=== Performance summary ===
[I] Throughput: 57.8406 qps
[I] Latency: mean = 17.8946 ms
```
yolov7QAT.engine with dynamic batch (max = 16)

```
=== Performance summary ===
[I] Throughput: 23.8396 qps
[I] Latency: mean = 42.046 ms
```
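A caveat on the comparison (an assumption on my part, worth verifying): when `trtexec` loads a dynamic-shape engine without an explicit input shape, I don't believe the reported numbers necessarily correspond to batch 1, so the two summaries above may be measuring different batch sizes. Passing `--shapes` at load time should make the comparison fair:

```shell
# Benchmark the dynamic-batch engine at an explicit batch size (here batch 1)
/usr/src/tensorrt/bin/trtexec --loadEngine=yolov7QAT.engine --shapes=images:1x3x640x640
```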
When I follow yolo_deepstream/tree/main/tensorrt_yolov7 and use yolov7QAT.engine for a batched detection task, the following error occurs:

```shell
./build/detect --engine=yolov7QAT.engine --img=./imgs/horses.jpg,./imgs/zidane.jpg
```
Error Message
Note
Environment