Hi yicheng, thanks for your awesome work. I want to know how you pretrained the Swin-T-based camera branch. Could you share some details, such as which dataset it uses (ImageNet or nuImages) and the input scale of the images during pretraining? If you could share your pretraining code, that would be even better. Thanks! Best
Thanks for your interest.
The Swin-T backbone and FPN are further finetuned by us on nuImages, starting from the COCO-pretrained weights (https://download.openmmlab.com/mmdetection/v2.0/swin/mask_rcnn_swin-t-p4-w7_fpn_1x_coco/mask_rcnn_swin-t-p4-w7_fpn_1x_coco_20210902_120937-9d6b7cfa.pth). All the finetuning settings (input size, learning rate, epochs) are the same as in https://github.com/open-mmlab/mmdetection3d/blob/main/configs/nuimages/mask-rcnn_r50_fpn_1x_nuim.py; you just need to change the backbone from R50 to Swin-T.
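For anyone looking for a starting point, here is a minimal sketch (not the authors' released config) of that backbone swap in MMDetection config style. The Swin-T hyperparameters are copied from MMDetection's mask_rcnn_swin-t-p4-w7_fpn_1x_coco config; the `_base_` path assumes the file sits next to the nuImages config.

```python
_base_ = './mask-rcnn_r50_fpn_1x_nuim.py'

model = dict(
    backbone=dict(
        _delete_=True,  # drop the inherited ResNet-50 settings entirely
        type='SwinTransformer',
        embed_dims=96,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        window_size=7,
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.,
        attn_drop_rate=0.,
        drop_path_rate=0.2,
        patch_norm=True,
        out_indices=(0, 1, 2, 3),
        with_cp=False,
        convert_weights=True),
    # FPN input channels for the four Swin-T stages
    neck=dict(in_channels=[96, 192, 384, 768]))

# Start finetuning from the full COCO Mask R-CNN checkpoint linked above.
load_from = 'https://download.openmmlab.com/mmdetection/v2.0/swin/mask_rcnn_swin-t-p4-w7_fpn_1x_coco/mask_rcnn_swin-t-p4-w7_fpn_1x_coco_20210902_120937-9d6b7cfa.pth'
```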
Thanks for your reply!
In the Swin-T-based SparseFusion, the image input size is 448×800, but mask-rcnn_r50_fpn_1x_nuim.py uses a different size. Do I need to finetune with 448×800 instead?
That does not matter; the backbone and FPN weights transfer across input resolutions, so you do not need to resize to 448×800 for finetuning.
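For context, a hedged sketch of the two resize settings involved (both scales are assumptions inferred from the respective configs, not verified here):

```python
# 2D finetuning on nuImages: keep the base config's own scale
# (assumed to be the native 1600x900 nuImages resolution).
resize_2d = dict(type='Resize', img_scale=(1600, 900), keep_ratio=True)

# SparseFusion training: its own pipeline resizes camera images to
# 448x800 (illustrative transform; the real config may differ), so the
# finetuned backbone is simply reused at the smaller resolution.
resize_3d = dict(type='Resize', img_scale=(800, 448), keep_ratio=True)
```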