FeiYull / TensorRT-Alpha

🔥🔥🔥TensorRT for YOLOv8、YOLOv8-Pose、YOLOv8-Seg、YOLOv8-Cls、YOLOv7、YOLOv6、YOLOv5、YOLONAS......🚀🚀🚀CUDA IS ALL YOU NEED.🍎🍎🍎

YOLOv8 TensorRT INT8 quantization #13

Closed: mattherdelma closed this issue 1 year ago

mattherdelma commented 1 year ago

How do I implement TensorRT INT8 quantization for YOLOv8?

FeiYull commented 1 year ago

@mattherdelma @lyb36524 Here is an FP16 quantization of the ONNX model, built with the following command:

```bash
./trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n_fp16.trt --fp16 --buildOnly \
    --minShapes=images:1x3x640x640 \
    --optShapes=images:4x3x640x640 \
    --maxShapes=images:8x3x640x640
```

The other options are the same as in the link. The figures below show the time overhead; the test device is an NVIDIA RTX 4090 with BATCH_SIZE = 8:
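Since the question is about INT8: trtexec can also build an INT8 engine. Below is a minimal sketch, assuming you already have a calibration cache (the file name `calib.cache` is a placeholder). If `--calib` is omitted, trtexec falls back to dummy dynamic ranges, which is acceptable for speed benchmarking but not for deployment accuracy:

```bash
# Sketch of an INT8 build; calib.cache stands in for a real calibration
# cache generated from your own calibration images.
./trtexec --onnx=yolov8n.onnx --saveEngine=yolov8n_int8.trt --int8 \
    --calib=calib.cache --buildOnly \
    --minShapes=images:1x3x640x640 \
    --optShapes=images:4x3x640x640 \
    --maxShapes=images:8x3x640x640
```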

FP32: [trtexec timing screenshot]

FP16: [trtexec timing screenshot]
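To generate a calibration cache programmatically, the usual route is TensorRT's entropy calibrator interface. The following is a minimal C++ sketch, not code from this repo; the class name, batch count, and the constant-fill preprocessing stub are all illustrative and would need to be replaced with the same letterbox/normalization preprocessing used at inference time:

```cpp
// Minimal INT8 entropy calibrator sketch for the TensorRT C++ API.
// Assumes a single input binding "images" of shape Nx3x640x640.
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

class YoloInt8Calibrator : public nvinfer1::IInt8EntropyCalibrator2 {
public:
    YoloInt8Calibrator(int batchSize, const std::string& cacheFile)
        : mBatchSize(batchSize), mCacheFile(cacheFile) {
        cudaMalloc(&mDeviceInput, batchSize * kInputVolume * sizeof(float));
    }
    ~YoloInt8Calibrator() override { cudaFree(mDeviceInput); }

    int getBatchSize() const noexcept override { return mBatchSize; }

    bool getBatch(void* bindings[], const char* names[], int nbBindings) noexcept override {
        if (mBatchIndex >= kNumCalibBatches) return false;  // no more data
        // Placeholder preprocessing: fill with a constant. Real calibration
        // must load images and apply the exact inference-time preprocessing.
        std::vector<float> hostBatch(mBatchSize * kInputVolume, 0.5f);
        cudaMemcpy(mDeviceInput, hostBatch.data(),
                   hostBatch.size() * sizeof(float), cudaMemcpyHostToDevice);
        bindings[0] = mDeviceInput;  // single input binding ("images")
        ++mBatchIndex;
        return true;
    }

    const void* readCalibrationCache(size_t& length) noexcept override {
        mCache.clear();
        std::ifstream in(mCacheFile, std::ios::binary);
        if (in) mCache.assign(std::istreambuf_iterator<char>(in), {});
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }

    void writeCalibrationCache(const void* cache, size_t length) noexcept override {
        std::ofstream out(mCacheFile, std::ios::binary);
        out.write(static_cast<const char*>(cache), length);
    }

private:
    static constexpr int kInputVolume = 3 * 640 * 640;  // per-image elements
    static constexpr int kNumCalibBatches = 100;        // illustrative count
    int mBatchSize;
    int mBatchIndex = 0;
    std::string mCacheFile;
    void* mDeviceInput = nullptr;
    std::vector<char> mCache;
};
```

The calibrator is attached to the builder config with `config->setFlag(nvinfer1::BuilderFlag::kINT8)` and `config->setInt8Calibrator(&calibrator)`; once a build has written `calib.cache`, that file can be reused with the trtexec command above.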

We will publish more quantization tutorials in the future.