raoyongming / DenseCLIP

[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Dimension error when loading ViT-B weights #19

Closed. xuzhang1199 closed this issue 2 years ago

xuzhang1199 commented 2 years ago

For the config denseclip_fpn_vit-b_640x640_80k.py, text_encoder sets embed_dim=512, while ViT-B-16.pt has embed_dim=1024. Loading the weights fails with: "RuntimeError: Error(s) in loading state_dict for CLIPTextContextEncoder: size mismatch for text_projection: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512])". How do you deal with this problem?

xuzhang1199 commented 2 years ago

Sorry, I loaded the wrong weights.
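
For anyone hitting the same error, a quick way to confirm that a checkpoint does not match the config is to compare parameter shapes before calling load_state_dict. The snippet below is only a minimal sketch: find_shape_mismatches is a hypothetical helper (not part of DenseCLIP or PyTorch), and it works on plain name-to-shape dicts so it runs without any dependencies. The example shapes are the ones from the error message above. With PyTorch, such dicts could presumably be built with something like {k: tuple(v.shape) for k, v in state_dict.items()}.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Hypothetical helper: list parameters present in both dicts
    whose shapes differ, as (name, checkpoint_shape, model_shape)."""
    mismatches = []
    for name, expected in model_shapes.items():
        found = checkpoint_shapes.get(name)
        # Parameters missing from the checkpoint are skipped here;
        # only shape conflicts are reported.
        if found is not None and found != expected:
            mismatches.append((name, found, expected))
    return mismatches


# Shapes copied from the error message in this issue.
checkpoint_shapes = {"text_projection": (512, 1024)}
model_shapes = {"text_projection": (512, 512)}

for name, found, expected in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {found} vs model {expected}")
```

If text_projection shows up in the output, the checkpoint file is most likely not the one the config expects, which is what happened here.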