Hi,
Great work! I was wondering how important the ImageNet pre-training is in Table-1. What happens if we pre-train DETReg from scratch on the MS-COCO train 2017 set and compare that against the methods pre-trained on ImageNet in Table-1? Did you try that experiment? If so, is it worse than pre-training on ImageNet-1k? And if it is worse, is that because of the smaller number of images, or because of poor region proposals on scene-centric datasets like COCO?
Thanks