Yolov5 FP16 engine TensorRT model was not faster than FP32

marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models

MIT License


Yolov5 FP16 engine TensorRT model was not faster than FP32 #112

Closed BTVinh0409 closed 2 years ago

BTVinh0409 commented 2 years ago

Hi, I converted the yolov5l.pt model to an FP32 TensorRT engine. It works well, at around 60 FPS. But when I convert to FP16 instead, the FPS does not increase at all, and both exported engines have the same file size. Did I do something wrong?

mfoglio commented 2 years ago

Is it possible that your GPU does not support FP16 inference?

juanfeyero commented 2 years ago

I have the same issue with YOLOv4: when I switch to FP16, the engine file is the same size as the FP32 one, and the FPS is the same for both.

mfoglio commented 2 years ago

What GPU are you using?

juanfeyero commented 2 years ago

I am using the Jetson AGX Xavier.

BTVinh0409 commented 2 years ago

Hi @mfoglio, I am using an RTX 2080, which supports FP16 inference. Can you share the FPS you get when running inference with the FP32 and FP16 models? Thank you so much.

marcoslucianops commented 2 years ago

I will check it

juanfeyero commented 2 years ago

I noticed that the same issue happens with INT8 when you already have the calib.table and build the engine.

DavidBaldsiefen commented 2 years ago

Any update? I have the same issue here, but maybe I am missing something. I am running a custom yolor-csp model and get around 20 FPS on the Jetson AGX Xavier. Changing network-mode from 0 (FP32) to 2 (FP16) does not change the FPS at all.

marcoslucianops commented 2 years ago

I need a few more days to check; I'm swamped with work right now.

pakike commented 2 years ago

I fixed this bug by following the link below:

https://forums.developer.nvidia.com/t/deepstream-6-yolo-performance-issue/194238/22

Mainly, you have to modify these files: yolo.cpp, yolo.h, and nvdsinfer_yolo_engine.cpp.

marcoslucianops commented 2 years ago

> Mainly you have to modify these files:

If you do that, you will lose all the optimizations and new models added by this repo.

pakike commented 2 years ago

You only need to swap in the new line in each file. For example, in yolo.cpp, delete the line marked - and add the line marked +:

-nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder)
+nvinfer1::ICudaEngine *Yolo::createEngine (nvinfer1::IBuilder* builder, nvinfer1::IBuilderConfig* config)
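A rough sketch of why this signature change could matter, assuming the old createEngine built its own default config while the precision flag was set on a config owned by the caller. That assumption is a reading of the linked forum thread, not something confirmed here. Everything named Mock* below, along with the Precision enum, is a hypothetical stand-in and not the TensorRT API:

```cpp
#include <cstdint>

// Hypothetical stand-ins for the TensorRT builder types (NOT the real API).
enum class Precision : std::uint32_t { kFP16 = 1u << 0, kINT8 = 1u << 1 };

struct MockBuilder {};

struct MockBuilderConfig {
    std::uint32_t flags = 0;  // no flags set means plain FP32
    void setFlag(Precision p) { flags |= static_cast<std::uint32_t>(p); }
    bool getFlag(Precision p) const {
        return (flags & static_cast<std::uint32_t>(p)) != 0;
    }
};

struct MockEngine {
    bool fp16;  // whether the engine was built with the FP16 flag
};

// Old shape (assumed): only the builder is passed in, so the function
// makes its own default config and never sees the caller's FP16 flag.
MockEngine createEngineOld(MockBuilder* /*builder*/) {
    MockBuilderConfig localConfig;  // flags == 0, i.e. FP32
    return MockEngine{localConfig.getFlag(Precision::kFP16)};
}

// New shape: the caller's config is passed through, so any precision
// flag already set on it is honored when the engine is built.
MockEngine createEngineNew(MockBuilder* /*builder*/,
                           const MockBuilderConfig* config) {
    return MockEngine{config->getFlag(Precision::kFP16)};
}
```

With a caller config that has kFP16 set, createEngineOld still yields an FP32 engine while createEngineNew yields an FP16 one, which would match the symptom of identical engine sizes and FPS.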

georgetsu commented 2 years ago

I can confirm: after applying the steps pakike mentioned, the FP16 engine file got smaller, which brought the FPS back up to normal, as in DS 5.1. Previously, the FP16 TRT engine was the same size as the FP32 one.

marcoslucianops commented 2 years ago

Thank you @pakike, I confirmed the issue and updated the repo.