tianrun-chen / SAM-Adapter-PyTorch

Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
MIT License

Mismatch errors #2

Closed hwei-hw closed 1 year ago

hwei-hw commented 1 year ago

There are mismatch errors when loading the pretrained weights sam_vit_l_0b3195.pth, so the hyper-parameters in demo.yaml appear to be wrong.

I found one error myself: embed_dim should be 1024 instead of 1280 to match this checkpoint. After correcting it, I still got a few mismatch errors, as follows:

```
RuntimeError: Error(s) in loading state_dict for SAM:
	size mismatch for image_encoder.blocks.5.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
	size mismatch for image_encoder.blocks.5.attn.rel_pos_w: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
	size mismatch for image_encoder.blocks.7.attn.rel_pos_h: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
	size mismatch for image_encoder.blocks.7.attn.rel_pos_w: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
	size mismatch for image_encoder.blocks.11.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
	size mismatch for image_encoder.blocks.11.attn.rel_pos_w: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
	size mismatch for image_encoder.blocks.15.attn.rel_pos_h: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
	size mismatch for image_encoder.blocks.15.attn.rel_pos_w: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
	size mismatch for image_encoder.blocks.17.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
	size mismatch for image_encoder.blocks.17.attn.rel_pos_w: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
```
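For reference, here is a minimal sketch of the kind of check that enumerates these mismatches without triggering the RuntimeError (a generic helper I wrote, not from this repo; it assumes the checkpoint file stores a plain state_dict):

```python
import torch

def report_shape_mismatches(model, ckpt_path):
    """Print every key whose shape differs between checkpoint and model.

    Assumes torch.load returns the state_dict directly; adjust if the
    checkpoint wraps it in another dict.
    """
    ckpt = torch.load(ckpt_path, map_location="cpu")
    model_sd = model.state_dict()
    for key, tensor in ckpt.items():
        if key in model_sd and model_sd[key].shape != tensor.shape:
            print(f"{key}: checkpoint {tuple(tensor.shape)} "
                  f"vs model {tuple(model_sd[key].shape)}")
```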

So, could you provide a correct yaml file? Thanks a lot!

littletomatodonkey commented 1 year ago

You should use the ViT-H pretrained weights rather than the ViT-L weights; the config provided by the repo targets ViT-H.
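For context, the two encoders are configured quite differently. This excerpt is from the official segment-anything build_sam.py (not from this repo's config):

```python
# Image-encoder settings for the two SAM variants, per the official
# segment-anything repo (build_sam.py).
def build_sam_vit_h(checkpoint=None):
    return _build_sam(
        encoder_embed_dim=1280,
        encoder_depth=32,
        encoder_num_heads=16,
        encoder_global_attn_indexes=[7, 15, 23, 31],
        checkpoint=checkpoint,
    )

def build_sam_vit_l(checkpoint=None):
    return _build_sam(
        encoder_embed_dim=1024,
        encoder_depth=24,
        encoder_num_heads=16,
        encoder_global_attn_indexes=[5, 11, 17, 23],
        checkpoint=checkpoint,
    )
```

This also explains the specific rel_pos errors above: blocks 5, 11, and 17 use global attention in the ViT-L checkpoint but windowed attention in the model, while blocks 7 and 15 are the reverse. Global blocks store relative positions over the full 64x64 token grid (2*64-1 = 127 rows), windowed blocks over a 14x14 window (2*14-1 = 27 rows), hence the 127-vs-27 mismatches.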

tianrun-chen commented 1 year ago

Thanks for your interest! Please use the ViT-H weights, as noted in our manuscript.
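As a quick sanity check that the weights match the architecture, here is a minimal sketch using the official segment-anything API (this repo loads the checkpoint through its yaml config instead, so this is only illustrative):

```python
from segment_anything import sam_model_registry

# sam_vit_h_4b8939.pth is the ViT-H checkpoint from the official
# Segment Anything release; load_state_dict is strict by default,
# so any shape mismatch would raise here.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
```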

hwei-hw commented 1 year ago

Ok, thanks for your reply!