It seems that the choice of backbone clearly affects the final mAP score.
Since ConvNeXt-XL surpasses Swin-L on ImageNet-22k, I am curious whether you have tried using it to improve the results.
I notice that your code already includes ConvNeXt-XL.
Could you share some details about the mAP score on COCO for DINO with ConvNeXt-XL?
In our preliminary experiments, Swin-L outperformed ConvNeXt. However, we did not tune the hyperparameters, such as the weight decay, so we are not sure about the performance of ConvNeXt.
Much appreciated!