yasenh / libtorch-yolov5

A LibTorch inference implementation of the yolov5
MIT License
372 stars 114 forks

Performance on Win10 with GPU #32

SHKChan closed 3 years ago

SHKChan commented 3 years ago

My device is an i5 with a 1080 Ti GPU (11 GB). I have successfully compiled and run the project on Win10 with GPU, but why does inference take that much time (Release mode)? I already commented out the warm-up part in main.cpp, and it still takes around 500 ms to process a single image. When I use the same model for detection in Python, it is much more efficient, at 20 FPS. I don't know whether something is wrong with my configuration or whether some other issue in the C++ project is decreasing the performance.

yasenh commented 3 years ago

Hi @SHK2018,

  1. First of all, you don't need to comment out the warm-up part; I put it there so that the real inference time is measured more accurately.
  2. Another way is to process a video, or several images in a loop, and look at the inference time.
  3. If it still takes 500 ms to process, make sure you exported the model on GPU and run with the "--gpu" flag; you can check out torchscript-model-export for more info. In addition, you can monitor your GPU usage.
  4. I also tested this repo on Win10 before, and it worked as expected, so please share your feedback if you still have the issue.
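
As an illustration of points 1 and 2, here is a minimal sketch of timing a call only after a few untimed warm-up runs. It is not code from this repo: `time_after_warmup` and `mean_ms` are hypothetical helper names, and `infer` stands in for whatever runs one forward pass. CUDA work is launched asynchronously, so for GPU inference the callable would presumably also need to wait for the device to finish before returning, otherwise the timer may stop too early.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: calls `infer` `warmup` times without timing,
// then `runs` more times, returning the latency of each timed call in ms.
std::vector<double> time_after_warmup(const std::function<void()>& infer,
                                      std::size_t warmup,
                                      std::size_t runs) {
    // Warm-up calls absorb one-time costs (lazy initialization, allocation).
    for (std::size_t i = 0; i < warmup; ++i) {
        infer();
    }

    std::vector<double> latencies_ms;
    latencies_ms.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        infer();
        const auto end = std::chrono::steady_clock::now();
        latencies_ms.push_back(
            std::chrono::duration<double, std::milli>(end - start).count());
    }
    return latencies_ms;
}

// Hypothetical helper: arithmetic mean of the latencies, 0.0 for an empty list.
double mean_ms(const std::vector<double>& latencies_ms) {
    if (latencies_ms.empty()) {
        return 0.0;
    }
    double sum = 0.0;
    for (double v : latencies_ms) {
        sum += v;
    }
    return sum / static_cast<double>(latencies_ms.size());
}
```

Passing a lambda that wraps the model's forward call, with a warm-up count of 2 or 3, should give per-frame numbers comparable to the steady-state figure rather than the first-frame cost.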

Hope it helps!

SHKChan commented 3 years ago

Thank you. I tried it on a video with the warm-up part, and it works totally fine now at 9 ms per frame. Thanks a lot.

SHKChan commented 3 years ago

Hi @yasenh, I'm curious about the inference time of the first two frames on GPU. The first frame, which is an empty image, takes 200-300 ms, and the second one, which is my incoming image, takes more than 1 s (I forget the exact number, but it is much slower than the first frame). The following images take just 10 ms per frame. Why does the second image take so much more time than the rest of the frames? Thanks for your help!

yasenh commented 3 years ago

Could you try the following setup in your Win10 environment?

Set the environment variable: setx CUDA_LAUNCH_BLOCKING 1

Then run: ./libtorch-yolov5 --source ../images/bus.jpg --weights ../weights/yolov5s.torchscript.pt --gpu --view-img
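
One thing to watch with the setup above: `setx` stores the variable for processes started afterwards, not for the console it is typed in, so the program has to be launched from a new console window. A minimal sketch for confirming that the variable is actually visible to a process follows; `launch_blocking_enabled` is a hypothetical helper name and is not part of this repo.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true only if CUDA_LAUNCH_BLOCKING is present
// in this process's environment and its value is exactly "1".
bool launch_blocking_enabled() {
    const char* value = std::getenv("CUDA_LAUNCH_BLOCKING");
    return value != nullptr && std::string(value) == "1";
}
```

Printing the result of this check at startup would show whether the run being timed really had blocking launches enabled.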