Reason for choosing the maxpool (size = 2, stride=1) in yolov3-tiny?

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

stride=2 in any layer (maxpool, convolutional, reorg, ...) reduces the feature map size. With stride=1 (and Darknet's default padding) the feature map size stays the same.
The [maxpool] layer reduces the spatial dependence of features (it doesn't matter which of the 4 cells in the window contains the maximum feature). This lets you detect compressed / stretched / elastic / slightly rotated objects.
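
To make the two points above concrete, here is a minimal pure-Python sketch of a max-pool over a single 2-D feature map. The function name `maxpool2d` and the padding rule (total padding of `size - 1`, out-of-bounds cells ignored) are my assumptions, chosen to roughly follow how Darknet sizes its maxpool output. It is not the Darknet code itself.

```python
def maxpool2d(fmap, size=2, stride=1):
    """Max-pool a 2-D list of numbers.

    Assumption: total padding is size - 1 per axis, with (size - 1) // 2
    cells before and the rest after. Padded cells never win the max.
    """
    h, w = len(fmap), len(fmap[0])
    pad = size - 1
    offset = pad // 2
    out_h = (h + pad - size) // stride + 1
    out_w = (w + pad - size) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            best = float("-inf")
            for di in range(size):
                for dj in range(size):
                    y = i * stride - offset + di
                    x = j * stride - offset + dj
                    if 0 <= y < h and 0 <= x < w:  # skip padded cells
                        best = max(best, fmap[y][x])
            row.append(best)
        out.append(row)
    return out


# Two 4x4 maps with the same strong feature, shifted by one cell diagonally.
a = [[0] * 4 for _ in range(4)]
b = [[0] * 4 for _ in range(4)]
a[1][1] = 9
b[2][2] = 9

pooled_a = maxpool2d(a, size=2, stride=1)
pooled_b = maxpool2d(b, size=2, stride=1)
halved = maxpool2d(a, size=2, stride=2)
```

With size=2, stride=1 both `pooled_a` and `pooled_b` stay 4x4, and the output cell at row 1, column 1 has the same value in both, because its 2x2 window covers both positions of the feature. With stride=2 the same call returns a 2x2 map.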

This is a smaller version of the SPP block. You can read about Spatial Pyramid Pooling here: https://arxiv.org/abs/1406.4729v4
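
For comparison, a rough sketch of what a full SPP-style block does: several stride-1 max-pools with different window sizes run on the same feature map, and the results are stacked next to the input as extra channels. The helper names `maxpool_same` and `spp_block` are hypothetical, and the window sizes 5, 9, 13 are an assumption on my side (they are the ones I remember from yolov3-spp.cfg), so check the cfg before relying on them.

```python
def maxpool_same(fmap, size):
    """Stride-1 max pooling that keeps the input height and width."""
    h, w = len(fmap), len(fmap[0])
    before = (size - 1) // 2  # padding before; the remainder goes after
    return [
        [
            max(
                fmap[y][x]
                for y in range(max(0, i - before), min(h, i - before + size))
                for x in range(max(0, j - before), min(w, j - before + size))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def spp_block(fmap, sizes=(5, 9, 13)):
    """Return the input plus one pooled copy per window size (a list of channels)."""
    return [fmap] + [maxpool_same(fmap, s) for s in sizes]


# A 13x13 map with one active cell in the centre.
fmap = [[0] * 13 for _ in range(13)]
fmap[6][6] = 1
channels = spp_block(fmap)
```

Every channel keeps the 13x13 size, and the larger the window, the further the single active cell spreads. The single maxpool with size=2, stride=1 is the same idea with one small window and no concatenation.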
