AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

increase upscaling #8875

Open ravi-lourem opened 6 months ago

ravi-lourem commented 6 months ago

@AlexeyAB If one wants to increase the upsampling before the third YOLO detection layer, is it enough to modify only the upsampling stride, or are other changes needed, for example in the conv filters? Thanks, any help is appreciated.

rajaththomson commented 5 months ago

To increase the upsampling before the third detection layer, you may need to modify more than just the upsampling stride:

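Upsampling Stride: Increasing the stride of the [upsample] layer produces a larger feature map, but this change alone may not be sufficient. For instance, changing the stride from 2 to 4 doubles the width and height of the upsampled map, which quadruples its area. If the upsampled map is then concatenated with an earlier layer through a [route] layer, as in the stock YOLOv3 configuration, that earlier layer must have the same width and height.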
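The size change can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a 416x416 network input and a feature map at stride 16 entering the upsample layer (a YOLOv3-style layout), and `head_grid_size` is a hypothetical helper, not part of Darknet.

```python
def head_grid_size(input_size: int, backbone_stride: int, upsample_stride: int) -> int:
    """Side length of the feature map produced by one upsample layer.

    input_size      -- network input width/height in pixels (assumed square)
    backbone_stride -- total downsampling of the map entering the upsample layer
    upsample_stride -- the `stride` value of the upsample layer
    """
    if input_size % backbone_stride != 0:
        raise ValueError("input_size must be divisible by backbone_stride")
    return (input_size // backbone_stride) * upsample_stride


if __name__ == "__main__":
    # Assumed values: 416x416 input, stride-16 map before the upsample layer.
    for stride in (2, 4):
        side = head_grid_size(416, 16, stride)
        print(f"upsample stride={stride}: {side}x{side} grid, {side * side} cells")
```

Under these assumptions the grid grows from 52x52 to 104x104, so any layer concatenated with it would also have to be 104x104.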

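Convolutional Filters: You might also need to adjust the convolutional layers that follow, since the size and characteristics of the feature map have changed. This can mean experimenting with the number of filters, the kernel size, and the stride of the convolutional layers after the upsampling so that they work effectively at the new resolution. The convolutional layer directly before the [yolo] layer is the exception: its filter count is determined by the number of classes and the number of anchors assigned to that layer, not by the feature map size.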

Receptive Field Adjustment: With increased upsampling, the receptive field of the convolutional layers that follow changes. The receptive field is the region of the input image that a given feature in a CNN responds to. You may need to adjust the kernel size or the architecture of the convolutional layers to maintain an effective receptive field for the detection task.
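The effect on the receptive field can be estimated with the usual recurrence: a convolution with kernel k adds (k - 1) times the current step size, and a nearest-neighbour upsample leaves the receptive field unchanged while making the step finer. The sketch below is a rough estimate under those assumptions; the layer stack is hypothetical and is not taken from any Darknet cfg file.

```python
def receptive_field(layers):
    """Return (receptive_field, jump) in input pixels for a layer stack.

    layers is a list of ("conv", kernel, stride) or ("upsample", factor).
    `jump` is the distance in input pixels between neighbouring outputs.
    """
    r, j = 1.0, 1.0
    for layer in layers:
        kind = layer[0]
        if kind == "conv":
            _, kernel, stride = layer
            r += (kernel - 1) * j   # each extra tap widens the field by one jump
            j *= stride             # striding coarsens the output step
        elif kind == "upsample":
            _, factor = layer
            j /= factor             # nearest-neighbour copy: same field, finer step
        else:
            raise ValueError(f"unknown layer type: {kind}")
    return r, j


# Hypothetical backbone: four 3x3 convolutions with stride 2 (total stride 16).
backbone = [("conv", 3, 2)] * 4

if __name__ == "__main__":
    for factor in (2, 4):
        stack = backbone + [("upsample", factor), ("conv", 3, 1)]
        r, j = receptive_field(stack)
        print(f"upsample x{factor}: receptive field {r:.0f} px, jump {j:.0f} px")
```

In this toy stack the 3x3 convolution after the upsample adds less context when the upsampling factor is larger, which is one reason larger kernels or extra layers may be worth trying.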

Anchor Boxes: In YOLO, anchor boxes are predefined box shapes that the network's predictions are made relative to. After changing the upsampling scale, you might need to revisit the anchor sizes assigned to that detection layer so that they match the scale of the objects you expect it to detect.
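One way to judge whether the anchors still suit the finer grid is to express them in grid cells instead of pixels. The sketch below assumes anchors are given in pixels of the network input and that the detection grid sits at stride 8 before the change and stride 4 after it. The anchor values are only examples, and `anchors_in_cells` is a hypothetical helper.

```python
def anchors_in_cells(anchors_px, grid_stride):
    """Convert (width, height) anchors from input pixels to grid-cell units."""
    if grid_stride <= 0:
        raise ValueError("grid_stride must be positive")
    return [(w / grid_stride, h / grid_stride) for w, h in anchors_px]


if __name__ == "__main__":
    # Example anchor sizes in pixels (illustrative values only).
    small_anchors = [(10, 13), (16, 30), (33, 23)]
    for stride in (8, 4):
        print(f"grid stride {stride}:", anchors_in_cells(small_anchors, stride))
```

If the anchors span many cells on the new grid, smaller anchors fitted to your own dataset may be a better match for that layer.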

Feature Extraction Layers: Consider whether the earlier layers in the network provide sufficient feature extraction for the higher resolution that will result from increased upsampling. You may need to deepen or adjust the network to extract more detailed features.