AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

When will jetpack4.4 (CUDA 10.2, cuDNN 8.0) be fully supported? #6585

Open yacad opened 4 years ago

yacad commented 4 years ago

Hi @AlexeyAB.

When will jetpack4.4 (CUDA 10.2, cuDNN 8.0) be fully supported?

In jetpack4.3 (CUDA 10.2, cuDNN 7.6.3), setting CUDNN=1 increases the speed. However, in jetpack4.4 (CUDA 10.2, cuDNN 8.0), setting CUDNN=1 still makes it slower.

When can I increase the speed by setting CUDNN=1 in jetpack4.4 (CUDA 10.2, cuDNN 8.0)?

I previously asked NVIDIA about the same problem on jetpack4.4DP, but did not get an answer: https://forums.developer.nvidia.com/t/darknet-yolo-slowing-down-when-using-jetpack4-4s-cudnn-8-0-0-on-jetson-xavier-nx-and-jetson-nano/123698 At that time, jetpack4.4 was a developer preview (jetpack4.4DP), but now that it has been released, I think this problem should be resolved.

Thanks @AlexeyAB.

Shame-fight commented 4 years ago

You can use yolov4_tensorrt; it will be 2-3 times faster than darknet.

yacad commented 4 years ago

Thanks for your reply @Shame-fight

I already know that using tensorrt speeds it up. I don't want to use tensorrt; I want cuDNN 8.0 to work properly with darknet so that I get a faster framerate, just as the speed improved with cuDNN 7.6.3.

miziacz commented 4 years ago

NVIDIA just released cuDNN 8.0.3:

https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_8.html#rel-803

The performance of cudnnConvolutionBiasActivationForward() for INT8x4 use cases on Volta and Turing, INT8x32 use cases on Turing, FP32 and pseudo-FP16 use cases on Volta, Turing, and Ampere GPU architecture have been improved.

I haven't tried it yet, but you can see if it improves your use case.
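
Before comparing speeds, it may help to confirm which cuDNN build is actually installed (8.0.0 from jetpack4.4 or the newer 8.0.3). Below is a minimal sketch, assuming the version header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as cuDNN headers normally do. The helper names `parse_cudnn_version` and `read_cudnn_version` are hypothetical, and the header path `/usr/include/cudnn_version.h` is an assumption that may differ on your system.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of a cuDNN version header."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 8".
        match = re.search(
            r"^\s*#\s*define\s+CUDNN_" + name + r"\s+(\d+)",
            header_text,
            re.MULTILINE,
        )
        if match is None:
            raise ValueError("CUDNN_" + name + " not found in header text")
        parts.append(int(match.group(1)))
    return tuple(parts)


def read_cudnn_version(path="/usr/include/cudnn_version.h"):
    """Read a header file and parse it. The default path is an assumption."""
    with open(path) as header_file:
        return parse_cudnn_version(header_file.read())
```

Calling `read_cudnn_version()` on the device should return a tuple such as `(8, 0, 3)` if the header is at the assumed path. Older cuDNN 7.x installs usually keep these defines in `cudnn.h`, so pass that path instead.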