very low fps of converted model

TNTWEN / OpenVINO-YOLOV4

This is an implementation of YOLOv4, YOLOv4-relu, YOLOv4-tiny, YOLOv4-tiny-3l, Scaled-YOLOv4 and INT8 quantization in OpenVINO 2021.3

MIT License

very low fps of converted model #64

Open akashAD98 opened 2 years ago

akashAD98 commented 2 years ago

I'm using OpenVINO version 20.04.

I'm able to run inference, but I'm only getting 1-3 FPS. Why is it so low? My goal in using OpenVINO is to get high FPS on CPU, but it's giving very low FPS. Is there a reason for this, and how can we solve it?
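For anyone comparing numbers in this thread, here is a minimal sketch of one way FPS could be measured so that results are comparable. It is not taken from the repo: `measure_fps` and `dummy_infer` are hypothetical names, and `infer` stands in for whatever call runs the converted model on one frame.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return the average frames per second of `infer` over `frames`.

    `infer` is any callable that processes a single frame (hypothetical
    stand-in for the real model call).
    """
    frames = list(frames)
    # Warm-up calls are not timed, so one-time setup cost does not skew the result.
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Dummy workload: pretend each frame takes about 5 ms to process.
    def dummy_infer(frame):
        time.sleep(0.005)

    print(f"{measure_fps(dummy_infer, range(50)):.1f} FPS")
```

Timing only the inference call, separately from camera capture and drawing, makes it easier to tell whether the model or the rest of the pipeline is the bottleneck.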

TNTWEN commented 2 years ago

@akashAD98 Which model are you using? It should be faster if it is yolov4-tiny. If yolov4 is used, 1-3 FPS is normal, because yolov4 is too large. Intel integrated graphics will be faster.

You could also prune your YOLO model with https://github.com/TNTWEN/Pruned-OpenVINO-YOLO to speed up inference if your dataset is not very large. That repo also shows how to convert your model to INT8, which is friendly to CPU.

To sum up, my suggestion is to try yolov4-tiny + INT8 to get higher FPS. If you have sufficient GPU resources, you could also try yolov4 + pruned model + INT8.
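To illustrate what INT8 conversion means at the level of individual weights, here is a rough sketch of symmetric per-tensor quantization in plain Python. This is an assumption-laden illustration only, not the OpenVINO calibration procedure and not code from either repo; `quantize_int8` and `dequantize` are hypothetical helper names.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] plus one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The scale maps the largest magnitude onto 127; fall back to 1.0 for all zeros.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.8, -1.27, 0.0, 0.333]  # made-up example values
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    for w, c, r in zip(weights, codes, restored):
        print(f"{w:+.3f} -> {c:+4d} -> {r:+.3f}")
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error of at most half the scale per value. Whether accuracy holds up for a given model has to be checked after calibration.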

akashAD98 commented 2 years ago

@TNTWEN I'm using the yolov4 model. I tried this OpenVINO notebook https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/401-object-detection-webcam and the FPS there is very high, up to 50-60 on CPU (it's a MobileNetV2 model).

So my goal is to get high FPS for the yolov4 and yolov4x mish models. I tried this:

https://dicksonneoh.com/portfolio/how_to_10x_your_od_model_and_deploy_50fps_cpu/#-post-training-quantization

I was able to convert the model to OpenVINO IR format, but I'm not able to run inference on the converted model.

Please let me know: can we get high FPS (50-60) on CPU using this technique for a yolov4mish/csp model?

TNTWEN commented 2 years ago

@akashAD98

https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssdlite_mobilenet_v2 This model has just 1.525 GFLOPs and 4.475 MParams; it's even smaller than yolov3-tiny.

YOLOv4 has 59.57 BFLOPS and 244 MParams. Even for many NVIDIA GPUs, it is difficult to reach 50-60 FPS!
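As a back-of-the-envelope check on these figures, the sketch below compares the two compute costs. It rests on two loud assumptions that are not from the thread: that BFLOPS and GFLOPs both mean billions of floating-point operations per inference, and that FPS scales inversely with FLOPs on the same CPU, which is only a crude approximation. `scaled_fps` is a hypothetical helper.

```python
SSDLITE_GFLOPS = 1.525  # ssdlite_mobilenet_v2, figure quoted above
YOLOV4_GFLOPS = 59.57   # YOLOv4, quoted above as BFLOPS (assumed equal to GFLOPs)


def scaled_fps(reference_fps, reference_gflops, target_gflops):
    """Estimate target FPS, assuming FPS is inversely proportional to FLOPs."""
    return reference_fps * reference_gflops / target_gflops


if __name__ == "__main__":
    ratio = YOLOV4_GFLOPS / SSDLITE_GFLOPS
    low = scaled_fps(50, SSDLITE_GFLOPS, YOLOV4_GFLOPS)
    high = scaled_fps(60, SSDLITE_GFLOPS, YOLOV4_GFLOPS)
    print(f"YOLOv4 needs about {ratio:.0f}x the compute per frame")
    print(f"Scaled estimate: {low:.1f} to {high:.1f} FPS")
```

Under these assumptions YOLOv4 costs roughly 39 times more per frame, so a CPU that reaches 50-60 FPS on the small model would land at about 1.3 to 1.5 FPS, which is in line with the 1-3 FPS reported in this issue.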

Also, FP16 and FP32 are very slow on CPU. INT8 is necessary!

TNTWEN commented 2 years ago

@akashAD98 Maybe you could try https://github.com/Megvii-BaseDetection/YOLOX

akashAD98 commented 2 years ago

@TNTWEN Thanks, I will definitely try this. But what do you think, can we get 50-60 FPS with yolov4mish/csp if we use the Model Optimizer?

TNTWEN commented 2 years ago

@akashAD98 I think it's very hard for yolov4mish/csp. A friend of mine used my repo (Pruned-OpenVINO-YOLO) to prune a yolov4 model from 244 MParams to 800 KParams (detecting only one class). Combined with INT8, he got 19 FPS on CPU, but I don't know which CPU he used.