AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Comparison of some models on CPU vs VPU (neurochip) vs GPU #5079

Open AlexeyAB opened 4 years ago

AlexeyAB commented 4 years ago

Accuracy and FPS:

| Model | AP50...95 (MSCOCO), accuracy | mAP50 (MSCOCO), accuracy | CPU - 90 Watt - FP32 (Intel Core i7-6700K 4GHz 8 Logical Cores) OpenCV-DLIE, FPS | VPU - 2 Watt - FP16 (Intel Myriad X) OpenCV-DLIE, FPS | GPU - 175 Watt - FP32/16 (nVidia GeForce RTX 2070) Darknet-cuDNN, FPS |
|---|---|---|---|---|---|
| yolov4-tiny 416x416 | | 40.2% | - | - | 330 |
| yolov3-tiny 416x416 | | 33.1% | 35 | 6.5 | 340 |
| yolov3-tiny-PRN 416x416 | | 33.1% | 46 | 5.3 | 370 |
| EfficientNetB0-Yolo 416x416 | | 45.5% | 11 | - | 55 |
| yolov3 416x416 | 31.0% | 55.3% | - | - | - |
| yolov3-spp 512x512 | | ~59.6% | 3.3 | 1.1 | 52 |
| csresnext50-opt 512x512 | 42.4% | 64.4% | 3.5 | 0.64 | 37 |
| csdarknet53-opt 256x256 async=3 | 33.3% | 53.0% | 14 | 11 | 74 |
| csdarknet53-opt 512x512 | 42.4% | 64.5% | 3.5 | 1.23 | 50 |
| csdarknet53-mish 512x512 (YOLOv4) | 43.0% | 64.9% | - | - | 50 |
| csresnext50-opt 608x608 | 43.2% | 65.4% | - | - | 34 |
| csdarknet53-mish 608x608 (YOLOv4) | 43.5% | 65.7% | - | - | 37 |
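
One way to read the table is to divide each FPS figure by the power budget named in the column header. The sketch below does that for the csdarknet53-opt 512x512 row (3.5, 1.23 and 50 FPS). The FPS-per-Watt metric is an illustration, not something measured in this thread, and the Watt values are the nominal figures from the headers rather than measured power draw.

```python
# Nominal power figures taken from the table headers (not measured draw).
POWER_WATTS = {"CPU": 90.0, "VPU": 2.0, "GPU": 175.0}

# FPS for csdarknet53-opt 512x512, copied from the table.
FPS = {"CPU": 3.5, "VPU": 1.23, "GPU": 50.0}


def fps_per_watt(fps, watts):
    """Return throughput normalized by each device's nominal power budget."""
    return {device: fps[device] / watts[device] for device in fps}


if __name__ == "__main__":
    for device, value in fps_per_watt(FPS, POWER_WATTS).items():
        print(f"{device}: {value:.3f} FPS/W")
```

On this row the VPU delivers the most frames per Watt even though its absolute FPS is the lowest of the three devices.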
WongKinYiu commented 4 years ago

@AlexeyAB Hello,

So currently EfficientNetB0-Yolo is the fastest model on VPU?

AlexeyAB commented 4 years ago

@WongKinYiu Hi,

Yes, it seems the VPU (Intel Myriad X) is highly optimized for grouped convolutions and maybe SE-blocks. I will test it more.

Maybe with the new Google-Coral-TPU-edge the performance ratio will, in general, be the same as with Intel Myriad X.

So maybe it makes sense to train GhostNet ghostnet.cfg.txt and yolov3-tiny-3l-ghostnet (as a new tiny-yolo model): https://github.com/AlexeyAB/darknet/issues/4418#issue-530577441

WongKinYiu commented 4 years ago

@AlexeyAB Thanks,

ghostnet is now training, at 40k/800k iterations.

AlexeyAB commented 4 years ago

@WongKinYiu Are you training ghostnet with CutMix+Mosaic+Label-smoothing?

Also did we get improvement for any network with DropBlock?

LukeAI commented 4 years ago

This is a fantastic resource. If at all possible, it'd be great to also see results for "batch=4" or similar.

WongKinYiu commented 4 years ago

@AlexeyAB No, just the ghostnet.cfg.txt you provided before.

AlexeyAB commented 4 years ago

@WongKinYiu I also added https://github.com/AlexeyAB/darknet/blob/master/cfg/efficientnet-lite3.cfg that you can try to train with subdivisions=6 or 4
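
For context, Darknet splits each batch of `batch` images into `subdivisions` chunks, so a smaller `subdivisions` value means more images per forward pass and more GPU memory use. A rough sketch of the arithmetic, assuming `batch=64` (the value in efficientnet-lite3.cfg may differ, so check the file):

```python
def images_per_pass(batch, subdivisions):
    """Return (images per forward pass, images left over per batch).

    Uses plain integer division; how Darknet itself treats a batch that is
    not evenly divisible by subdivisions is not covered here.
    """
    return divmod(batch, subdivisions)


if __name__ == "__main__":
    batch = 64  # assumed value, not taken from the cfg discussed above
    for subdivisions in (6, 4):
        per_pass, leftover = images_per_pass(batch, subdivisions)
        print(f"subdivisions={subdivisions}: {per_pass} images per pass, "
              f"{leftover} left over")
```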

WongKinYiu commented 4 years ago

@AlexeyAB Thanks, I am looking at the code in the new commits.

WongKinYiu commented 4 years ago

@AlexeyAB I set subdivisions=4 and the training has started.

ShaneHsieh commented 4 years ago

Hi @AlexeyAB, when you test on CPU and VPU, do you use FP32? As far as I know, the VPU can use FP16 and INT8. This information is very important.

AlexeyAB commented 4 years ago

@ShaneHsieh I added this information: CPU uses FP32, VPU uses FP16, GPU uses FP32/16 (Tensor Cores). Each device uses the lowest floating-point precision that increases speed without loss of accuracy.

ShaneHsieh commented 4 years ago

Thanks. Comparing CPU and GPU when both use FP32, the CPU gets better performance with EfficientNetB0-Yolo. This is good information.

andeyeluguo commented 4 years ago

What does OpenCV-DLIE mean?

WongKinYiu commented 4 years ago

OpenCV-DLIE (deep learning Inference Engine), supported by OpenVINO Toolkit.

WongKinYiu commented 4 years ago

Yes, you can use opencv dnn module to run the models. For example, yolov3, yolov3-tiny-prn, efficientnetb0-yolo...

But because the mish activation function and "eliminate grid sensitivity" are not yet supported by the opencv dnn module, you cannot run yolov4 at this time.
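
For reference, both missing pieces are small formulas. Mish is `x * tanh(softplus(x))`, and "eliminate grid sensitivity" in the YOLOv4 paper scales the sigmoid of the predicted box-center offset so that it can reach the borders of the grid cell. A minimal pure-Python sketch follows; the function names and the example scale of 1.2 are chosen here for illustration and are not the OpenCV or Darknet API.

```python
import math


def softplus(x):
    """Numerically safe ln(1 + exp(x))."""
    if x > 40.0:
        # exp(-x) is below double-precision resolution here, so the
        # result is x itself.
        return x
    return math.log1p(math.exp(x))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def sigmoid(x):
    """Logistic function written via tanh so large |x| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def scaled_center_offset(t, scale_x_y=1.2):
    """Offset of the box center inside its grid cell.

    With scale_x_y = 1.0 this is the plain sigmoid, confined to (0, 1).
    A larger scale widens the range to
    (-(scale_x_y - 1) / 2, 1 + (scale_x_y - 1) / 2).
    """
    return sigmoid(t) * scale_x_y - 0.5 * (scale_x_y - 1.0)


if __name__ == "__main__":
    print(mish(1.0))
    print(scaled_center_offset(0.0), scaled_center_offset(50.0))
```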

andeyeluguo commented 4 years ago

Does it support AlexeyAB's version? So far I have only found the TensorFlow version of yolo that OpenVINO supports.

WongKinYiu commented 4 years ago

For your reference: https://github.com/opencv/opencv/pull/16436

andeyeluguo commented 4 years ago

Could you please give me a tutorial on how to convert the cfg file to the xml format that OpenVINO supports? I saw the question "Does OpenCV-OpenVINO version supports Yolo v3 network?" on the site. It may have been asked by AlexeyAB.

WongKinYiu commented 4 years ago

Darknet is supported already. https://github.com/opencv/opencv/wiki/Deep-Learning-in-OpenCV

AlexeyAB commented 4 years ago

@andeyeluguo To use Yolo with OpenVINO (on CPU, GPU, VPU, ...) you should:

  1. install OpenVINO as usual
  2. install OpenCV with OpenVINO-backend: https://github.com/opencv/opencv/wiki/Intel's-Deep-Learning-Inference-Engine-backend
  3. run yolov3.cfg + yolov3.weights by using OpenCV-dnn; see the examples of how to use Yolo: https://docs.opencv.org/master/da/d9d/tutorial_dnn_yolo.html
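
Step 3 could look roughly like the sketch below in Python. It assumes an OpenCV build with the Inference Engine backend and local `yolov3.cfg` / `yolov3.weights` files. The helper names and the `TARGETS` table are illustrative, and the 416x416 input size is an assumption that should match the cfg.

```python
# Map a short device name to the name of the OpenCV dnn target constant.
TARGETS = {
    "cpu": "DNN_TARGET_CPU",
    "gpu": "DNN_TARGET_OPENCL",
    "vpu": "DNN_TARGET_MYRIAD",
}


def load_yolo(cfg_path, weights_path, device="cpu"):
    """Load a Darknet model and route it through the Inference Engine backend."""
    import cv2  # imported lazily; needs OpenCV built with the IE backend

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setPreferableTarget(getattr(cv2.dnn, TARGETS[device]))
    return net


def detect_raw(net, image, size=(416, 416)):
    """Run one forward pass and return the raw output of every YOLO layer."""
    import cv2

    # Scale pixels to [0, 1], resize to the network input, convert BGR to RGB.
    blob = cv2.dnn.blobFromImage(image, 1.0 / 255.0, size, swapRB=True, crop=False)
    net.setInput(blob)
    return net.forward(net.getUnconnectedOutLayersNames())
```

A call such as `detect_raw(load_yolo("yolov3.cfg", "yolov3.weights", "vpu"), cv2.imread("dog.jpg"))` would return one array per YOLO output layer (`dog.jpg` is a hypothetical test image). Confidence thresholding and non-maximum suppression still have to be applied to those arrays, as shown in the linked tutorial.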

YOLOv4 will be supported for OpenCV+OpenVINO soon: https://github.com/opencv/opencv/issues/17148

I added Yolo v2 to OpenCV 2.5 years ago: https://github.com/opencv/opencv/pull/9705

mmaaz60 commented 4 years ago

Can these models also be run on NCS 2 using the OpenCV DNN module with IE backend?

Luxonis-Brandon commented 4 years ago

@mmaaz60 it seems like that is the case. We will be trying on DepthAI (Myriad X based) shortly and will circle back.

Also @AlexeyAB if you have any instructions on how to use YOLOv4 on VPU, we'd be keen to try them out on DepthAI.

AlexeyAB commented 4 years ago

@Luxonis-Brandon

Current version of YOLOv4 is for Real-time on GPU. Later we will release YOLOv4-VPU for real-time >= 30 FPS on VPU.

[Image: modern_gpus]


There are two ways to run YOLOv4 on MyriadX:

  1. Support for YOLOv4 in OpenVINO - Wait until it is added to OpenVINO
  2. Support for YOLOv4 in OpenCV-dnn (with OpenVINO IE-backend ) - wait for solving this issue: https://github.com/opencv/opencv/issues/17148

Right now, you can try to use a slightly simpler version of YOLOv4, which is 0.5% worse on VPU Intel MyriadX by using C++ with OpenVINO:

use

AlexeyAB commented 4 years ago

@Luxonis-Brandon

I just tested csdarknet53-opt (YOLOv4 without MISH; in the cfg set width=256 height=256; 33.3% AP | 53.0% AP50) on your DepthAI (Myriad X) device with network resolution 256x256 and async=3 by using OpenCV (OpenVINO IE-backend) and got 11 FPS.

AlexeyAB commented 4 years ago

[Image: OpenCV_Vs_TensorRT]

ausk commented 4 years ago

OpenCV 4.4.0-pre, self-compiled. OpenVINO 2020.R3, Myriad. net.setPreferableTarget(cv2.dnn.DNN_TARGET_MYRIAD)

Input 416x416

efficient-b0, 395 ms
yolov3, 550 ms
yolov3-tiny-prn, 168 ms
yolov3-tiny, 128 ms
yolov4, 940 ms
efnet-coco, 395 ms
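
To compare these latencies with the FPS table at the top of the thread, convert milliseconds per frame to frames per second (FPS = 1000 / ms). A small sketch using the numbers from this comment; it assumes each figure is the time for a single synchronous inference:

```python
# Latencies in milliseconds, copied from the comment above.
LATENCY_MS = {
    "efficient-b0": 395,
    "yolov3": 550,
    "yolov3-tiny-prn": 168,
    "yolov3-tiny": 128,
    "yolov4": 940,
    "efnet-coco": 395,
}


def to_fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for model, ms in LATENCY_MS.items():
        print(f"{model}: {to_fps(ms):.2f} FPS")
```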

AlexeyAB commented 4 years ago

YOLOv4-tiny released: https://github.com/AlexeyAB/darknet/issues/6067

linyib commented 6 months ago

Hi, does anyone have the efficientnet-lite3.weights file? Could you share it with me?