IDEA-Research / DINO

[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

Increasing batch size does not increase inference speed per image, and negatively impacts mAP #212

Closed. InternetXplorer closed this issue 11 months ago.

InternetXplorer commented 11 months ago

Hello. As stated in the title, I have noticed that batch size does not have the expected effect when using your code, with ResNet-50, Swin, and FocalNet backbones and DINO_5scale as the head. When running evaluation with batch size > 1, the mAP decreases (on a custom dataset, with batch size = 8 the mAP drops by as much as 0.08 compared to batch size = 1), and the average inference time per image increases, which is the opposite of what is expected. Are these two issues already known? Is there a way to fix them? Otherwise I don't understand how training can be practical; even on very small datasets it is far too slow. Thank you.
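For reference, this is roughly how I measure average inference time per image. A minimal sketch, not code from this repo: `model` stands in for any callable that accepts a stacked batch tensor (DINO's actual forward takes a NestedTensor / image list), and it assumes the images have already been preprocessed to a common size and that a CUDA device is available.

```python
import time
import torch

@torch.no_grad()
def avg_time_per_image(model, images, batch_size, device="cuda"):
    """images: list of CxHxW tensors, all the same size (sketch assumption)."""
    model.eval().to(device)
    # Warm-up pass so one-time CUDA initialization does not skew the timing.
    _ = model(torch.stack(images[:batch_size]).to(device))
    torch.cuda.synchronize()
    start = time.perf_counter()
    for i in range(0, len(images), batch_size):
        batch = torch.stack(images[i:i + batch_size]).to(device)
        _ = model(batch)
    torch.cuda.synchronize()  # wait for all kernels before stopping the clock
    return (time.perf_counter() - start) / len(images)
```

With this kind of measurement, the per-image time should normally shrink as batch size grows, which is why the numbers I'm seeing look wrong.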

InternetXplorer commented 11 months ago

I saw in your README that the DINO_5scale models were trained with only 1 image per GPU, so that explains the drop in mAP: the model never saw padding during training, hence the degradation when I evaluate with batch size > 1. However, I still don't understand the speed issue; throughput should increase (and per-image latency decrease) with a larger batch size.
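To make the padding point concrete, here is a simplified sketch of how DETR-family codebases (DINO included) batch variable-size images: each image is zero-padded to the largest height and width in the batch, and a boolean mask marks the padded pixels. With batch size 1 there is no padding at all, so a model trained that way never sees masked regions. The function below is hypothetical and condensed from that pattern, not the repo's actual `nested_tensor_from_tensor_list`.

```python
import torch

def pad_and_mask(images):
    """images: list of CxHxW tensors with varying H and W."""
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    c = images[0].shape[0]
    batch = torch.zeros(len(images), c, max_h, max_w, dtype=images[0].dtype)
    # True marks padding; a bs=1 model only ever sees an all-False mask.
    mask = torch.ones(len(images), max_h, max_w, dtype=torch.bool)
    for i, img in enumerate(images):
        _, h, w = img.shape
        batch[i, :, :h, :w] = img   # copy real pixels into the top-left corner
        mask[i, :h, :w] = False     # unmask the real region
    return batch, mask
```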