Open · Henistein opened this issue 1 month ago
As stated here, RT-DETR-{R18,R50,R101} were trained on Objects365 and fine-tuned on the COCO dataset. Why not the opposite? I am curious because, since you already had models pretrained on COCO, why not just fine-tune those on Objects365? Is there a reason for that?

Thank you in advance!
My guess is that more data generally gives better results, so they pre-trained on the dataset with more data (Objects365, 2,000k images) and then fine-tuned on the smaller one (COCO, 328k images). Doing it in the other order would also leave you with a model adapted to Objects365's 365 classes, whereas fine-tuning on COCO last yields a model matched to the 80-class COCO benchmark it is evaluated on.

Edit: Just found this in the original paper:

"We pre-train RT-DETR on the larger Objects365 [35] dataset and then fine-tune it on COCO to achieve higher performance."
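For concreteness, here is a minimal sketch of that two-stage recipe in plain PyTorch. It is illustrative only: build_rtdetr, reset_classification_head, objects365_train, and coco_train are hypothetical placeholders rather than this repo's actual API, and it assumes the torchvision-style convention that a detection model returns a dict of losses in train mode.

```python
import torch
from torch.utils.data import DataLoader

def train_stage(model, dataset, lr, epochs):
    # One training stage; the fine-tuning stage typically uses a
    # lower learning rate than the pre-training stage.
    loader = DataLoader(dataset, batch_size=16, shuffle=True,
                        collate_fn=lambda batch: tuple(zip(*batch)))
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, targets in loader:
            loss_dict = model(images, targets)  # assumed: loss dict in train mode
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# Stage 1: pre-train on the larger dataset (Objects365, 2,000k images, 365 classes).
model = build_rtdetr(num_classes=365)                     # hypothetical constructor
train_stage(model, objects365_train, lr=1e-4, epochs=12)

# Stage 2: swap the classification head for COCO's 80 classes (backbone, encoder,
# and decoder keep their pre-trained weights), then fine-tune on the smaller dataset.
model = reset_classification_head(model, num_classes=80)  # hypothetical helper
train_stage(model, coco_train, lr=1e-5, epochs=24)
```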