Open May-forever opened 4 years ago
However, if the input feature map has size W x H x C (width x height x channels) and the
max-pooling parameters are F (filter size) and S (stride), the output size should be:
W_out = [(W - F)/S] + 1 and H_out = [(H - F)/S] + 1
On that basis, the output size of layer 78 of Yolov3-SPP.cfg should be:
W = [(19 - 5)/1] + 1 = 15 and H = [(19 - 5)/1] + 1 = 15, i.e., it should be 15x15x512
Why do you think so? Should be: https://github.com/AlexeyAB/darknet/blob/c7e3ba3ed41e9fd114390263f9dd1657b71f676c/src/maxpool_layer.c#L78-L80
Hi @AlexeyAB,
Thank you very much for your reply.
If l.out_w = (w + padding - size) / stride_x + 1, where 'size' is the pooling kernel size,
then the value of padding should always be padding = size - 1,
i.e., W = [(19 + 4 - 5)/1] + 1 = 19.
Am I right?
Looking forward to hearing from you, thank you very much.
the value of padding should always be: padding= size-1.
Yes, so with stride_x = 1:
l.out_w = (w + padding - size) / stride_x + 1 = (w + size - 1 - size) / 1 + 1 = (w - 1) + 1 = w
So output_w == input_w
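The derivation above can be checked numerically. This is only a minimal sketch: `pool_output_size` is a hypothetical name (not a darknet function), and it assumes C-style integer division and the padding = size - 1 value discussed in this thread.

```python
def pool_output_size(w, size, stride=1, padding=None):
    """Output width following the formula quoted in this thread:
    out_w = (w + padding - size) / stride + 1 (integer division)."""
    if padding is None:
        # Assumption taken from the discussion above: padding = size - 1.
        padding = size - 1
    return (w + padding - size) // stride + 1


# The three SPP pooling sizes applied to a 19-wide feature map.
for size in (5, 9, 13):
    print(size, pool_output_size(19, size))

# Without padding, the same formula shrinks the map.
print(pool_output_size(19, 5, padding=0))
```

With padding = size - 1 and stride 1, every pooling size returns the input width, which matches the 19 x 19 outputs reported for these layers. With padding set to 0, the 5 x 5 case shrinks the map instead.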
Thank you very much for your help
Also read about 2 types of Padding: SAME and VALID https://stackoverflow.com/questions/37674306/what-is-the-difference-between-same-and-valid-padding-in-tf-nn-max-pool-of-t
Ok, thank you very much.
Hi, near the bottom of page 4 of 'SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications', the authors claim that the 'SPP module is able to extract multiscale deep features with different receptive fields and fuse them by concatenating them in the channel dimension of feature maps.' However, when I train yolov3-spp, the output of every max-pooling layer in the SPP structure has the same size, which I take to mean that the receptive fields of all the max-pooling outputs in SPP are the same size. So I am confused by the phrase 'with different receptive fields'. Could you please give me some guidance? Thanks.
@Olivia-V
The receptive field of each output of maxpool 2 x 2 is 2x2 pixels.
The receptive field of each output of maxpool 7 x 7 is 7x7 pixels.
Hi @AlexeyAB, thanks a lot. However, for example, the input of layer 80 in yolov3-spp is 19 x 19 x 512, and the output of layer 80 is still 19 x 19 x 512. According to my understanding, because each point in the output feature map corresponds to a point in the input feature map, the receptive field should be 1x1 pixels. But according to your explanation, the receptive field should be 9 x 9. That would mean the padding pixels are not included in the calculation of the receptive field.
Am I right or wrong? If I am wrong, please point it out. Thanks in advance and merry Christmas. :)
78 max 5 x 5/ 1 19 x 19 x 512 -> 19 x 19 x 512 0.005 BF
79 route 77
80 max 9 x 9/ 1 19 x 19 x 512 -> 19 x 19 x 512 0.015 BF
81 route 77
82 max 13 x 13/ 1 19 x 19 x 512 -> 19 x 19 x 512 0.031 BF
83 route 82 80 78 77
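As a rough illustration of the listing above, the shapes can be traced in a few lines of Python. This is only a sketch: the `route` helper is hypothetical and assumes that a route layer with several inputs concatenates them along the channel dimension, as the SlimYOLOv3 quote describes.

```python
def route(shapes):
    """Hypothetical helper: concatenate (w, h, c) shapes along the channel axis."""
    w, h, _ = shapes[0]
    if any(s[:2] != (w, h) for s in shapes):
        raise ValueError("route inputs must share width and height")
    return (w, h, sum(c for _, _, c in shapes))


shapes = {77: (19, 19, 512)}      # input to the SPP block, taken from the listing
shapes[78] = shapes[77]           # max 5 x 5 / 1 keeps 19 x 19 x 512
shapes[79] = route([shapes[77]])  # route 77
shapes[80] = shapes[79]           # max 9 x 9 / 1 keeps 19 x 19 x 512
shapes[81] = route([shapes[77]])  # route 77
shapes[82] = shapes[81]           # max 13 x 13 / 1 keeps 19 x 19 x 512
shapes[83] = route([shapes[i] for i in (82, 80, 78, 77)])

print(shapes[83])  # (19, 19, 2048)
```

Under that assumption, layer 83 stacks the three pooled maps and the unpooled map, so the spatial size stays 19 x 19 while the channel count grows to 4 x 512.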
@Olivia-V
According to my understanding, because each point in the output feature map corresponds to a point in the input feature map,
Why do you think so?
What does 5x5 mean in max 5x5?
What does 5x5 mean in conv 5x5?
If, during training, 24 of the 25 weights became 0 and the remaining weight is equal to 0.5, then what is the receptive field of dw-conv 5x5?
What is the receptive field of conv 5x5? Are you sure that all 25 weights are not zero?
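One way to explore these questions is a small experiment. It is a sketch under stated assumptions, not darknet's code: `maxpool_same` is a hypothetical pure-Python helper that applies stride-1 max pooling with (size - 1) / 2 padded cells on each side and treats padded cells as minus infinity. A single activation is placed in the centre of a 19 x 19 map, and the script counts how many output cells see it.

```python
NEG_INF = float("-inf")


def maxpool_same(grid, size):
    """Stride-1 max pooling that keeps the input size (odd `size` assumed).

    Cells outside the map are skipped, which is equivalent to padding
    with -inf: padded cells can never win the max.
    """
    h, w = len(grid), len(grid[0])
    r = (size - 1) // 2
    out = [[NEG_INF] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    out[y][x] = max(out[y][x], grid[yy][xx])
    return out


# 19 x 19 map of zeros with a single activation in the centre.
grid = [[0.0] * 19 for _ in range(19)]
grid[9][9] = 1.0

for size in (5, 9, 13):
    pooled = maxpool_same(grid, size)
    touched = sum(v == 1.0 for row in pooled for v in row)
    print(size, len(pooled), len(pooled[0]), touched)
```

The output map stays 19 x 19 for every pooling size, yet the single input activation reaches 25, 81 and 169 output cells respectively. Each output cell therefore looks at a size x size window of the input, even though the input and output have the same number of cells.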
Oh, yes! you are right, thank you very much for your help.
Hi @AlexeyAB,
When I use Yolov3-SPP.cfg to train on my custom dataset, I noticed something strange.
In layer 78 of Yolov3-SPP.cfg (i.e., the first max-pooling layer of SPP), the size of the input feature
map is 19 x 19 x 512. After the 5 x 5/ 1 max-pooling operation, the output size is still
19 x 19 x 512.
However, if the input feature map has size W x H x C (width x height x channels) and the
max-pooling parameters are F (filter size) and S (stride), the output size should be:
W_out = [(W - F)/S] + 1 and H_out = [(H - F)/S] + 1
On that basis, the output size of layer 78 of Yolov3-SPP.cfg should be:
W = [(19 - 5)/1] + 1 = 15 and H = [(19 - 5)/1] + 1 = 15, i.e., it should be 15x15x512
Could you please help me understand why the actual output size of layer 78 differs from
this?
Looking forward to hearing from you, thanks a lot in advance.
**Below are the details of layers 78 to 83 of Yolov3-SPP.cfg:**
78 max 5 x 5/ 1 19 x 19 x 512 -> 19 x 19 x 512 0.005 BF
79 route 77
80 max 9 x 9/ 1 19 x 19 x 512 -> 19 x 19 x 512 0.015 BF
81 route 77
82 max 13 x 13/ 1 19 x 19 x 512 -> 19 x 19 x 512 0.031 BF
83 route 82 80 78 77