dog-qiuqiu / Yolo-Fastest

:zap: An ultra-lightweight universal object detection algorithm based on YOLO: the computation cost is only 250 MFLOPs, the ncnn model size is only 666 KB, a Raspberry Pi 3B can run it at 15+ FPS, and mobile devices can reach 178+ FPS

quite slow inference on GPU 2070super #42

Open thunder95 opened 3 years ago

thunder95 commented 3 years ago

I tested Yolo-Fastest on my PC, and it runs quite slowly: about 25 ms per frame on average. The XL version takes even longer, around 45 ms. BTW, my GPU is a 2070 Super. Did I test it incorrectly? Need help.
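One common source of inflated per-frame numbers is folding one-time setup (lazy initialization, the first CUDA kernel launches, memory allocation) into the average. A minimal timing sketch with an explicit warm-up phase is shown below; `run_inference` here is a hypothetical stand-in for the actual forward pass, not part of this repository:

```python
import time

def benchmark(fn, warmup=5, iters=50):
    """Return average wall-clock time per call in milliseconds."""
    # Discard warm-up runs: the first calls often pay one-time setup costs.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1000.0

# Hypothetical stand-in for the model's forward pass.
def run_inference():
    sum(i * i for i in range(10_000))

print(f"{benchmark(run_inference):.3f} ms per call")
```

Note that on a GPU, launches are asynchronous, so a real measurement would also need to synchronize the device (e.g. `torch.cuda.synchronize()` in PyTorch) before reading the clock.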

Hezhexi2002 commented 2 years ago

The model is CPU-inference friendly because it uses depthwise separable convolutions to reduce its size