zylo117 / Yet-Another-EfficientDet-Pytorch

The PyTorch re-implementation of the official EfficientDet, with SOTA real-time performance and pretrained weights.
GNU Lesser General Public License v3.0

Small Object Detect Issue #647

Closed WindVChen closed 3 years ago

WindVChen commented 3 years ago

This work is amazing! But I've met some problems detecting small objects.

I have read almost all of the issues about detecting small objects, and they don't resolve my doubts.

My objects are 15×15~20×20 in size, smaller than COCOTiny, so I changed 'anchor_scales' to '[2 ** -4, 2 ** -2, 2 ** -1]'. As the anchor size is calculated as "scale × 2^pyramid_level × base_scale", the anchor sizes range from 2 to 256, which I think is enough to cover my object sizes. Considering the default ratio setting may affect the matching between ground-truth boxes and anchor boxes, I also changed the IoU threshold to 0.2.
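The anchor-size arithmetic above can be sketched as a quick sanity check (a minimal sketch, assuming pyramid levels 3–7 and the default base_scale of 4.0 from the EfficientDet paper, not the repo's actual Anchors class):

```python
# Sketch of the anchor-size arithmetic described above, assuming
# pyramid levels 3-7 (strides 8..128, i.e. 2 ** level) and the
# default base_scale (anchor_scale) of 4.0.
base_scale = 4.0
pyramid_levels = [3, 4, 5, 6, 7]
anchor_scales = [2 ** -4, 2 ** -2, 2 ** -1]  # modified for small objects

sizes = sorted(
    scale * (2 ** level) * base_scale
    for level in pyramid_levels
    for scale in anchor_scales
)
print(sizes[0], sizes[-1])  # → 2.0 256.0
```

So the smallest anchor side is 2 and the largest is 256, matching the range claimed above.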

I have tried D0/D2/D4, training each for 1000 epochs (yes, 1000). Every 10 epochs I evaluate on my validation dataset, but the best mAP is low: 43/56/55 respectively (D4 is worse than D2). The best result appears around epoch 150.

I think the net has converged, since the training loss falls to about 0.3 on D0/D2, and about 0.005 on D4.

I also tried RetinaNet and got 73 mAP. As EfficientDet and RetinaNet both use Focal Loss and FPN/BiFPN, I think EfficientDet should not be weaker than RetinaNet. I'm very confused about that.

I'd appreciate any help and advice!

zylo117 commented 3 years ago

I think you should use the default anchor_scale, but also modify the code to use a higher resolution.

WindVChen commented 3 years ago

@zylo117 Thanks for your reply. I did try the default anchor_scale, which only got less than 20 mAP, so I think it's not suitable for my object size. About "use higher resolution", do you mean, for example, changing the input of D0 from 512 to a higher resolution? As my image size is already 512×512, I think the default resolution setting is reasonable.

Is there any other suggestion? Or can you give me some possible reasons why EfficientDet is weaker than RetinaNet on small objects? Thanks a lot.

WindVChen commented 3 years ago

@zylo117 I tried training with pretrained weights again, and the result seems back on track, about 72 mAP for D0. It seems pretrained weights are important for training on custom datasets, even though my data is totally different from natural images.

zylo117 commented 3 years ago

So you were training without pretrained weights? You shouldn't have done that, not unless you have a better and larger dataset like COCO.
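The weight-transfer idea for a custom dataset with a different class count can be sketched like this (toy modules stand in for EfficientDet; module and key names are illustrative, not the repo's actual ones):

```python
import torch
import torch.nn as nn

# Hedged sketch: copy every pretrained parameter whose name and shape
# match, and let the class-dependent head keep its fresh initialization.
class ToyDet(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, 3, padding=1)   # stands in for the backbone/BiFPN
        self.head = nn.Conv2d(16, num_classes, 1)        # class-dependent head

pretrained = ToyDet(num_classes=90)   # e.g. trained on COCO's 90 classes
model = ToyDet(num_classes=2)         # custom dataset with 2 classes

src = pretrained.state_dict()
dst = model.state_dict()
# Keep only parameters whose name AND shape match the target model.
transferable = {k: v for k, v in src.items()
                if k in dst and v.shape == dst[k].shape}
dst.update(transferable)
model.load_state_dict(dst)

# Backbone weights now match; the mismatched head was skipped.
assert torch.equal(model.backbone.weight, pretrained.backbone.weight)
```

Shape filtering like this is what lets a COCO-pretrained checkpoint initialize everything except the classification head on a custom dataset.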

WindVChen commented 3 years ago

@zylo117 Yes, I will pay attention next time. Another question: the detection heads in the paper are weight-shared, but those in your code seem independent of each other. Will these two approaches make a big difference in the final result?

zylo117 commented 3 years ago

Mine is also weight-shared. By shared weights, they mean the BN layers shared for every FPN output and the CONV layers shared for every regression layer. https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/blob/15403b5371a64defb2a7c74e162c6e880a7f462c/efficientdet/model.py#L363-L364
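A minimal sketch of that head pattern, assuming the common EfficientDet design of conv weights reused across all pyramid levels with a separate BatchNorm stack per level (class and dimension choices here are illustrative, not the repo's actual BoxNet):

```python
import torch
import torch.nn as nn

class SharedHead(nn.Module):
    """Illustrative head: one conv stack shared by every pyramid level,
    with per-level BatchNorm so each level keeps its own statistics."""
    def __init__(self, channels=64, num_layers=3, num_levels=5):
        super().__init__()
        # One conv stack reused for every pyramid level -> shared weights.
        self.conv_list = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1)
             for _ in range(num_layers)]
        )
        # A separate BN stack per level -> per-level statistics.
        self.bn_list = nn.ModuleList(
            [nn.ModuleList([nn.BatchNorm2d(channels)
                            for _ in range(num_layers)])
             for _ in range(num_levels)]
        )
        self.act = nn.ReLU()

    def forward(self, feats):
        outs = []
        for level, feat in enumerate(feats):
            for conv, bn in zip(self.conv_list, self.bn_list[level]):
                feat = self.act(bn(conv(feat)))
            outs.append(feat)
        return outs

head = SharedHead()
feats = [torch.randn(1, 64, s, s) for s in (64, 32, 16, 8, 4)]
outs = head(feats)  # one output per pyramid level, spatial sizes preserved
```

The conv parameters are counted once no matter how many levels pass through them, which is exactly why the non-shared variant below would make the weight file larger.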

zylo117 commented 3 years ago

I think the performance would be a little better without shared weights, but the weight file would be larger.

WindVChen commented 3 years ago

Sorry, I may have read it wrong. I will close this issue. Thank you again for your reply.