marcoslucianops / DeepStream-Yolo

NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models
MIT License

No detection with int8 #336

Closed storm12t48 closed 1 year ago

storm12t48 commented 1 year ago

Hello everyone,

I'd like to ask for your help. I followed the tutorial to switch to INT8. The engine build succeeded, not without difficulty, and I do have a model_b1_gpu0_int8.engine file. The problem is that the video launches but there is NO DETECTION. PS: on screen everything works fine in FP32, I should specify. I'm working in an NVIDIA DeepStream 6.2 Triton Docker container.

Regards
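For reference, INT8 inference is selected in the nvinfer configuration file. A minimal sketch of the relevant keys (the engine file name comes from this thread; the calibration table name is an assumption, use whatever the calibrator produced):

```ini
# config_infer_primary.txt (sketch): keys that select the INT8 engine.
[property]
model-engine-file=model_b1_gpu0_int8.engine
# table written by the INT8 calibrator during the engine build (assumed name)
int8-calib-file=calib.table
# 0=FP32, 1=INT8, 2=FP16
network-mode=1
```

If network-mode is left at 0 or 2, DeepStream will silently rebuild or reuse an FP32/FP16 engine instead of the INT8 one.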

124bit commented 1 year ago

DeepStream 6.2, gst-nvinfer, YOLOv8, INT8: nearly zero detections after calibration here too; only about 1% of the detections remain. Calibration used 1.5k images from the training dataset and batch size 7.

FP32 and FP16 work OK, although even FP32 is a bit worse than just running through PyTorch.

I dream about high-quality yolov8 int8.

P.S. Author, big thanks for the repo!

storm12t48 commented 1 year ago

What I mean is that my YOLOv8 program with a TensorRT engine on GPU gives me much better results in FP16, with an easy display option and performance of 10-15 ms per frame / 15-16 fps / 10-15 W. DeepStream certainly wins on throughput, but if nothing is detected I don't see much use in the extra fps. Also, the pre-cluster-threshold variable gives the impression that it has a blocked minimum value: if I decrease it further, it doesn't detect any more, or maybe I didn't understand it. Thanks for the repo! :p Nice job.

storm12t48 commented 1 year ago

I found this, in case it helps someone: you can use the Python apps with DeepStream examples on the NVIDIA repository: https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/tree/master

marcoslucianops commented 1 year ago

For now, the INT8 calibration is a basic implementation. My plan is to optimize it in the future.

> FP32 and FP16 work OK, although even FP32 is a bit worse than just running through PyTorch.

https://github.com/marcoslucianops/DeepStream-Yolo/issues/339

> Also, the pre-cluster-threshold variable gives the impression that it has a blocked minimum value: if I decrease it further, it doesn't detect any more, or maybe I didn't understand it.

The pre-cluster-threshold is equivalent to conf-thres in PyTorch YOLO models.
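For anyone tuning it, pre-cluster-threshold lives under `[class-attrs-all]` in the nvinfer config. A minimal sketch (0.25 is an example value mirroring a common conf-thres default in PyTorch YOLO, not a recommendation from this thread):

```ini
[class-attrs-all]
# plays the same role as conf-thres in PyTorch YOLO: boxes below this
# confidence are dropped before clustering/NMS
pre-cluster-threshold=0.25
```

Note that lowering this only exposes lower-confidence boxes; it cannot recover detections lost to a bad INT8 calibration.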

124bit commented 1 year ago

Thanks for the answer! We are gradually building a test suite to do a proper validation.

Can't wait for a good INT8.

marcoslucianops commented 1 year ago

I made some small adjustments to the INT8 calibration (PTQ), and it is now also available for ONNX-exported models. In the future, I'm planning to add the QAT method to get better accuracy.
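The ONNX PTQ flow is roughly: export the .pt weights with the repo's export script, then delete any stale engine so DeepStream rebuilds it in INT8 with the calibrator. A hedged sketch (script name and file names are assumptions based on the repo's utils directory, adjust to your setup):

```shell
# sketch, not verified: export YOLOv8 weights to ONNX with the repo's
# export script (assumed name/location), then remove the old engine so
# DeepStream regenerates it in INT8 using the calibration images
python3 utils/export_yoloV8.py -w yolov8s.pt
rm -f model_b1_gpu0_int8.engine
deepstream-app -c deepstream_app_config.txt
```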

124bit commented 1 year ago

Thanks, we will try!