AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

A possible error regarding channel number in the 1st stage #5466

Closed · jasw1001 closed this issue 4 years ago

jasw1001 commented 4 years ago

https://github.com/AlexeyAB/darknet/blob/6cbb75d10b43a95f11326a2475d64500b11fa64e/cfg/yolov4-custom.cfg#L50

https://github.com/AlexeyAB/darknet/blob/6cbb75d10b43a95f11326a2475d64500b11fa64e/cfg/yolov4-custom.cfg#L61

https://github.com/AlexeyAB/darknet/blob/6cbb75d10b43a95f11326a2475d64500b11fa64e/cfg/yolov4-custom.cfg#L77

https://github.com/AlexeyAB/darknet/blob/6cbb75d10b43a95f11326a2475d64500b11fa64e/cfg/yolov4-custom.cfg#L89

I guess that the number of channels (filters) mentioned in the above four lines should be 32, so that at line 95 the route layer can concatenate two 32-channel layers together into a 64-channel layer.

AlexeyAB commented 4 years ago

Why can't it concatenate two 64-channel layers together into a 128-channel layer?

filters=64 isn't the number of input channels; it is the number of filters (and the number of output channels): https://github.com/AlexeyAB/darknet/blob/6cbb75d10b43a95f11326a2475d64500b11fa64e/cfg/yolov4-custom.cfg#L95-L104

jasw1001 commented 4 years ago

In Line 42, the filters are set to 64, and then according to the CSP rule, the tensor should be split into two 32-channel paths. One goes directly to the end of this stage (Line 95), and the other goes through a regular darknet block starting from Line 59.

So the channel number should be like the following:

  1. Downsample: N * 2
  2. Path A: N
  3. Path B: N, then ResNet blocks
  4. Concatenation (route): N + N = 2N
  5. Transition: 2N
  6. Next downsample: 2 * 2N
  7. ...
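As an illustrative sketch (the helper name and the base width N are hypothetical, not from the cfg), the strict-CSP channel progression above works out to:

```python
# Sketch of the strict-CSP channel progression described above.
# N is the stage's base width; the values are illustrative only.

def csp_stage_channels(n):
    """Return channel counts for one strict-CSP stage of base width n."""
    downsample = n * 2          # 1. downsample doubles the width
    path_a = n                  # 2. path A: direct shortcut to the stage end
    path_b = n                  # 3. path B: goes through the ResNet blocks
    concat = path_a + path_b    # 4. route layer concatenates channels
    transition = concat         # 5. transition keeps 2N channels
    next_down = transition * 2  # 6. next downsample doubles again
    return downsample, path_a, path_b, concat, transition, next_down

print(csp_stage_channels(32))   # (64, 32, 32, 64, 64, 128)
```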

The 2nd stage has correct channel numbers, starting from Line 106, which you may refer to.

Jasper

WongKinYiu commented 4 years ago

CSPNet could be:

  1. Downsample: k_1
  2. Path A: alpha * k_1
  3. Path B: (1 - alpha) * k_1, then {ResNet, ResNeXt, DenseNet, whatever-net} blocks (c_1), then a transition (c_1')
  4. Concatenation (route): alpha * k_1 + c_1'
  5. Transition: k_1'
  6. Next downsample: k_2
  7. ...

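A minimal sketch of this generalized split (the function name is hypothetical; alpha, k_1, and c_1' follow the list above):

```python
# Sketch of the generalized CSP split: a fraction alpha of the k_1
# downsample channels goes to path A, the remainder feeds the blocks
# on path B, whose transition emits c1p channels before the merge.

def csp_merge_channels(k1, alpha, c1p):
    """Channels on each path and after the route that merges them."""
    path_a = int(alpha * k1)    # 2. path A keeps alpha * k_1 channels
    path_b = k1 - path_a        # 3. path B gets the remaining channels
    merged = path_a + c1p       # 4. route: alpha * k_1 + c_1'
    return path_a, path_b, merged

# With alpha = 0.5, k_1 = 128 and a path-B transition of c_1' = 64:
print(csp_merge_channels(128, 0.5, 64))   # (64, 64, 128)
```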
AlexeyAB commented 4 years ago

Oh, are you talking about strict compliance with the CSP architecture?

  1. We can optimize and change the CSP architecture.

> Downsample: N * 2

There can be any N in the Downsample layer: N/2, N, N*2, ...

  2. There are 2 completely equivalent ways, 100% equivalent in weights, inputs, and output results; we use (a):

[image: diagram comparing the two equivalent implementations (a) and (b)]

jasw1001 commented 4 years ago

@WongKinYiu I found similar lines in your repo:

https://github.com/WongKinYiu/CrossStagePartialNetworks/blob/520c1828fc4373c242773c56543d0744426ab9f5/cfg/csdarknet53.cfg#L51

Could you tell me the idea behind this setting of channels? Also, I find that in your cfg file, only the first stage does not follow the CSP rule strictly. Why?

WongKinYiu commented 4 years ago

@jasw1001

The channel numbers of the transition layers and down-sampling layers are not included in the rule. As @AlexeyAB says in https://github.com/AlexeyAB/darknet/issues/5466#issuecomment-623130824, we use (a) in this implementation:

[convolutional]
batch_normalize=1
filters=64 # <- N/2 (partial 1)
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64 # <- N/2 (partial 2)
size=1
stride=1
pad=1
activation=mish

and

[convolutional]
batch_normalize=1
filters=64 # <- N/2 (partial transition)
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-7 # <- N (merge)

It follows the same rule as the other stages.
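To tie the cfg fragments above back to the channel question: a [route] layer concatenates its input layers along the channel axis, so its output channel count is simply the sum of the inputs'. A small sketch (helper name is hypothetical; the 64s mirror the filters= values shown above):

```python
# Sketch: a [route] layer in darknet concatenates its input layers
# along the channel dimension, so the output channel count is the sum
# of the input channel counts.

def route_channels(*input_channels):
    """Output channels of a [route] layer given its inputs' channels."""
    return sum(input_channels)

# layers = -1,-7 merges the partial transition (64 ch, filters=64)
# with partial 1 (64 ch, filters=64):
print(route_channels(64, 64))   # 128
```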