enazoe / yolo-tensorrt

TensorRT8. Supports Yolov5n, s, m, l, x. darknet -> tensorrt. Yolov4 and Yolov3 use raw darknet *.weights and *.cfg files. If the wrapper is useful to you, please star it.
MIT License

INT8 engine building is too slow #159

Open · victor-yudin opened this issue 2 years ago

victor-yudin commented 2 years ago

Hi everyone,
I ran into a problem while launching YOLOv4 inference with INT8 precision on an RTX 3090 GPU: the buildEngineWithConfig() method is very slow (it had been running for 1.5 hours when I interrupted the process). I tried increasing MaxWorkspaceSize from 1 MiB (1<<20) to 32, 64, and 512 MiB, but without success.
~1k images are used for INT8 calibration.
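For reference, the build path I'm describing looks roughly like the following TensorRT 7 C++ sketch; the network and calibrator objects are placeholders, not the wrapper's exact code:

```cpp
// Minimal sketch of the INT8 build path (TensorRT 7 C++ API).
// `network` and `calibrator` are placeholders, not this repo's actual objects.
#include "NvInfer.h"

nvinfer1::ICudaEngine* buildInt8Engine(nvinfer1::IBuilder* builder,
                                       nvinfer1::INetworkDefinition* network,
                                       nvinfer1::IInt8Calibrator* calibrator)
{
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();

    // Workspace is scratch memory for tactic selection; 32-512 MiB were tried here.
    config->setMaxWorkspaceSize(512ULL << 20);

    // Enable INT8 and attach the calibrator that feeds the ~1k calibration images.
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    config->setInt8Calibrator(calibrator);

    // This call runs calibration plus tactic timing and is the step that hangs for hours.
    return builder->buildEngineWithConfig(*network, *config);
}
```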

Engines with FP32 or FP16 precision build in about 1 minute.

Environment:
Ubuntu 20.04
TensorRT 7.2.1
CUDA 11.1
cuDNN 8

Inference with the same configuration works well on a laptop with an RTX 2070 (building the INT8 engine takes ~12 minutes).

enazoe commented 2 years ago

Decrease the number of calibration images to ~100.
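Calibration time scales with how many batches the calibrator feeds the builder, so capping the image list is usually enough. A minimal sketch of an IInt8EntropyCalibrator2 that only uses the first N images (class and helper names here are illustrative, not the wrapper's actual calibrator):

```cpp
// Rough sketch: an entropy calibrator that caps the calibration set.
// Names and the preprocessing step are illustrative placeholders.
#include "NvInfer.h"
#include <string>
#include <vector>

class CappedCalibrator : public nvinfer1::IInt8EntropyCalibrator2
{
public:
    CappedCalibrator(std::vector<std::string> imagePaths, int batchSize,
                     size_t maxImages, void* deviceInput)
        : mPaths(std::move(imagePaths)), mBatchSize(batchSize), mDeviceInput(deviceInput)
    {
        if (mPaths.size() > maxImages)
            mPaths.resize(maxImages);   // e.g. ~100 images instead of ~1000
    }

    int getBatchSize() const override { return mBatchSize; }

    bool getBatch(void* bindings[], const char* names[], int nbBindings) override
    {
        if (mNext + mBatchSize > static_cast<int>(mPaths.size()))
            return false;               // fewer batches -> calibration finishes sooner

        // A real calibrator would preprocess mPaths[mNext .. mNext + mBatchSize)
        // and cudaMemcpy the result into mDeviceInput here.
        mNext += mBatchSize;
        bindings[0] = mDeviceInput;
        return true;
    }

    const void* readCalibrationCache(size_t& length) override { length = 0; return nullptr; }
    void writeCalibrationCache(const void* cache, size_t length) override {}

private:
    std::vector<std::string> mPaths;
    int mBatchSize{1};
    int mNext{0};
    void* mDeviceInput{nullptr};
};
```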

victor-yudin commented 2 years ago

> Decrease the number of calibration images to ~100.

Thank you! It helped.