qianyuzqy / TransVOD_Lite

(TPAMI 2023) TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers (implementations of TransVOD Lite).
Apache License 2.0

Coco pretrained model for Swin S backbone #18

Open goku-krish10 opened 1 year ago

goku-krish10 commented 1 year ago

Hello,

Loading the COCO-pretrained model for the Swin-S backbone from the Google Drive link fails with the error below. Could you please look into it? Thank you in advance.

RuntimeError: Error(s) in loading state_dict for DeformableDETR:
	size mismatch for query_embed.weight: copying a param with shape torch.Size([300, 512]) from checkpoint, the shape in current model is torch.Size([100, 512]).
	size mismatch for backbone.0.body.fpn.inner_blocks.0.weight: copying a param with shape torch.Size([256, 192, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 1, 1]).
	size mismatch for backbone.0.body.fpn.inner_blocks.1.weight: copying a param with shape torch.Size([256, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
	size mismatch for backbone.0.body.fpn.inner_blocks.2.weight: copying a param with shape torch.Size([256, 768, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).

awadsb1 commented 1 year ago

I am hitting the same error with SwinT:

RuntimeError: Error(s) in loading state_dict for DeformableDETR:
	size mismatch for query_embed.weight: copying a param with shape torch.Size([300, 512]) from checkpoint, the shape in current model is torch.Size([100, 512]).
	size mismatch for backbone.0.body.fpn.inner_blocks.0.weight: copying a param with shape torch.Size([256, 192, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 1, 1]).
	size mismatch for backbone.0.body.fpn.inner_blocks.1.weight: copying a param with shape torch.Size([256, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
	size mismatch for backbone.0.body.fpn.inner_blocks.2.weight: copying a param with shape torch.Size([256, 768, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
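
For anyone debugging this, a minimal sketch of how the mismatches in the error above can be listed side by side. The helper name `find_shape_mismatches` is hypothetical and not part of this repository. It works on plain dicts of parameter name to shape so it runs without PyTorch; with PyTorch, such dicts could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for every parameter
    that exists in both dicts but has a different shape."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message above.
checkpoint_shapes = {
    "query_embed.weight": (300, 512),
    "backbone.0.body.fpn.inner_blocks.0.weight": (256, 192, 1, 1),
    "backbone.0.body.fpn.inner_blocks.1.weight": (256, 384, 1, 1),
    "backbone.0.body.fpn.inner_blocks.2.weight": (256, 768, 1, 1),
}
model_shapes = {
    "query_embed.weight": (100, 512),
    "backbone.0.body.fpn.inner_blocks.0.weight": (256, 256, 1, 1),
    "backbone.0.body.fpn.inner_blocks.1.weight": (256, 512, 1, 1),
    "backbone.0.body.fpn.inner_blocks.2.weight": (256, 1024, 1, 1),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt, model) in mismatches.items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Read this way, the error reports two separate differences: the checkpoint has 300 query embeddings where the current model has 100, and the checkpoint's FPN expects 192/384/768 input channels where the current model's FPN is built for 256/512/1024.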

awadsb1 commented 1 year ago

@goku-krish10 I changed the line linked below to the following, and it worked for SwinT:

https://github.com/qianyuzqy/TransVOD_Lite/blob/3825586af70aaa670b3d928987b7f367999e8d76/models/swin_transformer.py#L503

self.fpn = FeaturePyramidNetwork(in_channels_list=[192, 384, 768], out_channels=256)
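
A rough sketch of where those channel numbers come from, in case it helps with other backbones. In Swin Transformer the channel width doubles at each stage, starting from the embedding dimension (96 for Swin-T and Swin-S, 128 for Swin-B). Assuming the FPN takes the last three of the four stages, which is inferred from the shapes in the error message rather than confirmed against the repository code, the values can be computed as below. The helper names are hypothetical.

```python
# Embedding dimensions from the Swin Transformer paper.
SWIN_EMBED_DIM = {"swin_t": 96, "swin_s": 96, "swin_b": 128}


def swin_stage_channels(embed_dim, num_stages=4):
    """Channel width of each Swin stage: it doubles at every stage."""
    return [embed_dim * 2 ** i for i in range(num_stages)]


def fpn_in_channels(embed_dim):
    """Input channels for the FPN, assuming it consumes the last three stages."""
    return swin_stage_channels(embed_dim)[1:]


for name, dim in SWIN_EMBED_DIM.items():
    print(name, fpn_in_channels(dim))
```

With an embedding dimension of 96 this gives [192, 384, 768], matching the checkpoint, while 128 gives [256, 512, 1024], matching the shapes the current model reported. The `query_embed.weight` mismatch (300 vs 100) is a separate setting that this line does not touch.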