AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Some questions about scaled yolov4 article #7041

Closed kadirbeytorun closed 3 years ago

kadirbeytorun commented 3 years ago

Hello Alexey and Wang,

I was reading the Scaled-YOLOv4 article and have some questions that I couldn't find the answers to.

In the first paragraph of the second page, you mention "The Depth of cspdarknet53= 65", "bottleneck ratio =1", "growth ratio between stages=2". I am quite confused by these expressions. Firstly, I thought the base network of YOLOv4, which is essentially CSPDarknet53, has 137 layers, if I am not mistaken? Secondly, how do you conclude that the bottleneck ratio (bottleneck layer channels / output channels) is 1? I examined all the layers but couldn't decide which one is the bottleneck layer (the one with the fewest nodes).

Thirdly, what do you mean by growth ratio between stages? Is it the channel growth ratio between consecutive Cross Stage Partial blocks?

Thanks in advance

WongKinYiu commented 3 years ago
  1. The depth of a CNN usually means its number of convolutional layers. 65 of the 137 layers are convolutional; the others are route, shortcut, etc.
  2. Bottleneck ratio means max_channel/min_channel in a residual layer. For example, resnet: 4, darknet53: 2, cspdarknet53: 1.
  3. Growth rate means the ratio of channels between two consecutive stages. The boundary of a stage is usually a down-sampling or up-sampling block.
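
As a rough illustration of point 1, the sketch below counts the layer sections in Darknet-style cfg text and reports how many of them are `[convolutional]`. The helper `count_layers` and the toy cfg are hypothetical, written for this example only; the toy cfg is not the real cspdarknet53 definition, so it will not reproduce the 65 and 137 figures.

```python
def count_layers(cfg_text):
    """Return (total_layers, conv_layers) for Darknet-style cfg text."""
    sections = []
    for line in cfg_text.splitlines():
        line = line.strip()
        # A section header looks like "[convolutional]" or "[route]".
        if line.startswith("[") and line.endswith("]"):
            sections.append(line[1:-1])
    # The [net] / [network] section holds hyperparameters and is not a layer.
    layers = [s for s in sections if s not in ("net", "network")]
    conv = [s for s in layers if s == "convolutional"]
    return len(layers), len(conv)


# Toy cfg, assumed for illustration only (not the real cspdarknet53 cfg).
TOY_CFG = """
[net]
width=416

[convolutional]
filters=32

[convolutional]
filters=64

[route]
layers=-1

[convolutional]
filters=64

[shortcut]
from=-2

[maxpool]
size=2
"""

total, depth = count_layers(TOY_CFG)
print(f"total layers: {total}, convolutional depth: {depth}")
```

Running the same helper on the text of an actual cfg file would give the total layer count and the convolutional depth for that network.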

For more details, you can refer to the RegNet paper.
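
A minimal sketch of the bottleneck ratio from point 2, taking max_channel/min_channel over the convolution output widths inside one residual layer. The channel widths below are assumptions chosen to reproduce the ratios quoted above (4, 2 and 1); they are not read from any cfg file, and `bottleneck_ratio` is a hypothetical helper.

```python
def bottleneck_ratio(channels):
    """max_channel / min_channel over the conv outputs of one residual layer."""
    return max(channels) / min(channels)


# Illustrative per-conv output widths of a single residual layer.
# These numbers are assumptions for this sketch, not taken from a cfg.
RESIDUAL_LAYERS = {
    "resnet": [64, 64, 256],    # 1x1 reduce, 3x3, 1x1 expand
    "darknet53": [32, 64],      # 1x1 reduce, 3x3 expand
    "cspdarknet53": [64, 64],   # 1x1 and 3x3 with equal widths
}

ratios = {name: bottleneck_ratio(ch) for name, ch in RESIDUAL_LAYERS.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

Under this definition there is no need to pick out a single "bottleneck layer": a ratio of 1 just says that every convolution inside the residual layer has the same width.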

kadirbeytorun commented 3 years ago
  1. Makes sense, so we don't count layers that are not computationally expensive; we only count convolutional depth.

  2. Now that you mention it, I observed that the channel count changes only between the CSP-ized residual layers.

  3. I see it now. When the feature map is downsampled to half its size, the channel count is doubled, and when it is upsampled, the channel count is halved. Can I conclude that on upsampling the same ratio is simply applied in reverse?
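
One way to picture point 3 is to compute the channel ratio between consecutive stages. The stage widths below are assumed for illustration (doubling at each down-sampling, halving at each up-sampling), and `growth_ratios` is a hypothetical helper. Reading the up-sampling path as a ratio of 0.5 is only one interpretation of the question above; the thread does not confirm it.

```python
def growth_ratios(stage_channels):
    """Channel ratio between each pair of consecutive stages."""
    return [b / a for a, b in zip(stage_channels, stage_channels[1:])]


# Assumed stage widths, for illustration only.
backbone_stages = [64, 128, 256, 512, 1024]   # one entry per down-sampling stage
upsampling_stages = [512, 256, 128]           # one entry per up-sampling stage

down = growth_ratios(backbone_stages)
up = growth_ratios(upsampling_stages)
print("down-sampling path:", down)
print("up-sampling path:", up)
```

With these assumed widths every down-sampling boundary has ratio 2, and every up-sampling boundary has the reciprocal.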

Many thanks again. I really enjoyed reading your article and playing around with the new network design.