Open 12343954 opened 2 years ago
You say:
BUT, in my yolov3, it's very fast. same machine, same hard devices.
...but it is not the same installation, is it? The first one starts with this:
CUDA-version: 10010 (11010), cuDNN: 7.6.5, CUDNN_HALF=1, GPU count: 1
OpenCV version: 4.2.0
compute_capability = 750, cudnn_half = 1
...and the 2nd one starts with this:
CUDA-version: 11040 (11040), cuDNN: 8.2.4, GPU count: 1
OpenCV version: 4.5.5
0 : compute_capability = 750, cudnn_half = 0, GPU: NVIDIA GeForce RTX 2060
So obviously you have different versions of CUDA, cuDNN, and OpenCV, and a different cudnn_half setting.
All of which were built in different ways. At the very least, use the same installed version of darknet etc. to compare the yolov3 and yolov4 numbers.
On my video card, this is what I get when I compare YOLOv4 and YOLOv3:
Tiny:
./darknet detector test cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.weights data/dog.jpg -dont_show
data/dog.jpg: Predicted in 483.695000 milli-seconds.
./darknet detector test cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights data/dog.jpg -dont_show
data/dog.jpg: Predicted in 482.337000 milli-seconds.
Full:
./darknet detector test cfg/coco.data cfg/yolov3.cfg yolov3.weights data/dog.jpg -dont_show
data/dog.jpg: Predicted in 499.029000 milli-seconds.
./darknet detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights data/dog.jpg -dont_show
data/dog.jpg: Predicted in 511.460000 milli-seconds.
As you can see, the v3 and v4 numbers are nearly identical. You should be getting similar results.
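To put a number on "nearly identical", here is a minimal Python sketch that pulls the time out of each "Predicted in ..." line and reports how much v4 differs from v3. The output lines are copied from the runs above; the function names predicted_ms and relative_difference are made up for this illustration and are not part of darknet.

```python
import re

# Output lines copied from the four darknet runs quoted above.
RUNS = {
    "yolov3-tiny": "data/dog.jpg: Predicted in 483.695000 milli-seconds.",
    "yolov4-tiny": "data/dog.jpg: Predicted in 482.337000 milli-seconds.",
    "yolov3": "data/dog.jpg: Predicted in 499.029000 milli-seconds.",
    "yolov4": "data/dog.jpg: Predicted in 511.460000 milli-seconds.",
}


def predicted_ms(line):
    """Return the time in ms from a 'Predicted in ...' line, or None."""
    m = re.search(r"Predicted in ([\d.]+) milli-seconds", line)
    return float(m.group(1)) if m else None


def relative_difference(v3_line, v4_line):
    """Relative change of v4 versus v3 (0.02 means v4 is 2% slower)."""
    v3, v4 = predicted_ms(v3_line), predicted_ms(v4_line)
    return (v4 - v3) / v3


if __name__ == "__main__":
    for old, new in (("yolov3-tiny", "yolov4-tiny"), ("yolov3", "yolov4")):
        change = relative_difference(RUNS[old], RUNS[new])
        print(f"{old} -> {new}: {change:+.1%}")
```

Running the same comparison on output from your own two installations should show whether the gap you see is a few percent, as here, or something much larger.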
And if you're looking for a faster tool, these are the results using DarkHelp:
Tiny:
DarkHelp cfg/yolov3-tiny.cfg yolov3-tiny.weights data/coco.names data/dog.jpg -> prediction took 3.865 milliseconds
DarkHelp cfg/yolov4-tiny.cfg yolov4-tiny.weights data/coco.names data/dog.jpg -> prediction took 4.755 milliseconds
Full:
DarkHelp cfg/yolov3.cfg yolov3.weights data/coco.names data/dog.jpg -> prediction took 16.077 milliseconds
DarkHelp cfg/yolov4.cfg yolov4.weights data/coco.names data/dog.jpg -> prediction took 32.306 milliseconds
@stephanecharette thank you for reply!
yolo v3 is installed on Windows 10 hard drive. yolo v4 is installed on another Windows 11 hard drive. They are all isolated from each other.
I don't understand why the speed difference between the two is so great.
I only see one place different. yolov3.cudnn_half =1, yolov4.cudnn_half = 0
I only see one place different.
I don't know why you say that. I pointed out all the differences to you above. Let me point them out again:
The first one starts with this:
CUDA-version: 10010 (11010), cuDNN: 7.6.5, CUDNN_HALF=1, GPU count: 1
OpenCV version: 4.2.0
compute_capability = 750, cudnn_half = 1
...and the 2nd one starts with this:
CUDA-version: 11040 (11040), cuDNN: 8.2.4, GPU count: 1
OpenCV version: 4.5.5
0 : compute_capability = 750, cudnn_half = 0, GPU: NVIDIA GeForce RTX 2060
So you have different hardware, different versions of CUDA, different versions of CUDNN, different versions of OpenCV, different configurations for darknet, and different versions of darknet.
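The version differences can also be checked mechanically rather than by eye. Below is a rough Python sketch that parses the two startup banners quoted above and prints only the fields that differ. The banner strings are copied from this thread; the field names and the regular expressions are assumptions made for this example, not anything darknet itself defines, and they only cover the values visible in these banners.

```python
import re

# Startup banners copied from the two runs quoted above.
BANNER_V3 = (
    "CUDA-version: 10010 (11010), cuDNN: 7.6.5, CUDNN_HALF=1, GPU count: 1 "
    "OpenCV version: 4.2.0 compute_capability = 750, cudnn_half = 1"
)
BANNER_V4 = (
    "CUDA-version: 11040 (11040), cuDNN: 8.2.4, GPU count: 1 "
    "OpenCV version: 4.5.5 0 : compute_capability = 750, cudnn_half = 0, "
    "GPU: NVIDIA GeForce RTX 2060"
)

# Hypothetical field names mapped to patterns for values shown in the banner.
PATTERNS = {
    "cuda": r"CUDA-version:\s*(\d+)",
    "cudnn": r"cuDNN:\s*([\d.]+)",
    "opencv": r"OpenCV version:\s*([\d.]+)",
    "compute_capability": r"compute_capability\s*=\s*(\d+)",
    "cudnn_half": r"cudnn_half\s*=\s*(\d+)",
}


def parse_banner(text):
    """Extract the known fields from a banner; missing fields become None."""
    fields = {}
    for name, pattern in PATTERNS.items():
        m = re.search(pattern, text)
        fields[name] = m.group(1) if m else None
    return fields


def diff_banners(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    fa, fb = parse_banner(a), parse_banner(b)
    return {k: (fa[k], fb[k]) for k in PATTERNS if fa[k] != fb[k]}


if __name__ == "__main__":
    for name, (old, new) in diff_banners(BANNER_V3, BANNER_V4).items():
        print(f"{name}: {old} -> {new}")
```

Note that the banner does not show the darknet version or the build options, so an empty diff would not by itself prove two installations are equivalent.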
@stephanecharette , thank you for reply! they are on the same machine, same hard device. just not the same operating system. I don't think the difference in software version can cause such a big difference in results.
I very much disagree. See my blog post on Darknet and FPS. Changes in software alone can make differences from 6.1 FPS to 71.5 FPS:
And on my RTX2070, the difference was even greater, from 5.3 FPS up to 209.7 FPS:
Source: https://www.ccoderun.ca/programming/2021-10-16_darknet_fps/
Your 5.3 FPS is on the CPU, not the GPU. 177.5 FPS is with CUDA, and 209.7 FPS is with CUDA + cuDNN. They are not running on the same hardware device, so the FPS differ greatly.
209.7 is more than 177.5 because cuDNN was working.
My yolov4 ETA is 436.8560 ms. I think GPU or cuDNN acceleration is not turned on, but I have no evidence.
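For reference, a per-image time in milliseconds can be put on the same scale as the FPS figures with FPS = 1000 / ms. A small Python sketch follows; the helper name ms_to_fps is made up for this example, and it assumes the single-image time is representative of sustained throughput, which may not hold for a one-off run.

```python
def ms_to_fps(ms):
    """Convert a per-frame time in milliseconds to frames per second."""
    if ms <= 0:
        raise ValueError("time per frame must be positive")
    return 1000.0 / ms


if __name__ == "__main__":
    # 436.8560 ms is the yolov4 time reported above.
    print(f"{ms_to_fps(436.8560):.2f} FPS")
```

By this conversion, 436.8560 ms per image is a little under 2.3 FPS.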
it might be different CMAKE_CUDA_ARCHITECTURES. Please, on both win10 and win11, make sure to have cmake updated to latest version (3.23), update cuda to 11.6 and cudnn to 8.4, then inside vcpkg do a
git pull
.\bootstrap-vcpkg.bat
.\vcpkg upgrade --no-dry-run
to make sure to update darknet everywhere and to compare apples with apples
Then please re-post results :)
@cenit thanks for reply!
PS D:\darknet\vcpkg\installed\x64-windows\tools\darknet> ./darknet detector test ./cfg/coco.data ./cfg/yolov4.cfg yolov4.weights data/dog.jpg
CUDA-version: 11060 (11060), cuDNN: 8.4.0, GPU count: 1
OpenCV version: 4.5.5
0 : compute_capability = 750, cudnn_half = 0, GPU: NVIDIA GeForce RTX 2060
net.optimized_memory = 0
mini_batch = 1, batch = 8, time_steps = 1, train = 0
layer filters size/strd(dil) input output
0 Create CUDA-stream - 0
Create cudnn-handle 0
conv 32 3 x 3/ 1 608 x 608 x 3 -> 608 x 608 x 32 0.639 BF
1 conv 64 3 x 3/ 2 608 x 608 x 32 -> 304 x 304 x 64 3.407 BF
2 conv 64 1 x 1/ 1 304 x 304 x 64 -> 304 x 304 x 64 0.757 BF
3 route 1 -> 304 x 304 x 64
...
[yolo] params: iou loss: ciou (4), iou_norm: 0.07, obj_norm: 1.00, cls_norm: 1.00, delta_norm: 1.00, scale_x_y: 1.05
nms_kind: greedynms (1), beta = 0.600000
Total BFLOPS 128.459
avg_outputs = 1068395
Allocate additional workspace_size = 18.88 MB
Loading weights from yolov4.weights...
seen 64, trained: 32032 K-images (500 Kilo-batches_64)
Done! Loaded 162 layers from weights-file
Detection layer: 139 - type = 28
Detection layer: 150 - type = 28
Detection layer: 161 - type = 28
data/dog.jpg: Predicted in 502.524000 milli-seconds. 😱😱😱
bicycle: 92%
dog: 98%
truck: 92%
pottedplant: 33%
Hi, I have the same problem. Inference time is too slow even though I set the GPU and CUDNN=1. I think you found the way to turn on the GPU or CUDA. Please let us know the solution. Thanks in advance.
Hi, I have the same problem. Inference time is too slow even though I set the GPU and CUDNN=1. I think you found the way to turn on the GPU or CUDA. Please let us know the solution.
The solution I recommend is to use this Darknet/YOLO repo: https://github.com/hank-ai/darknet#table-of-contents
The repo you are attempting to use is no longer maintained.
If something doesn’t work for you, then show 2 screenshots:
i used the
PS D:\> .\vcpkg install darknet[full]:x64-windows
to install darknet successfully! After testing, I found that the detection was too slow! It looks like the GPU isn't speeding up! BUT, in my yolov3, it's very fast. same machine, same hard devices.