There are size-mismatch errors when loading the pretrained weights sam_vit_l_0b3195.pth, so the hyper-parameters in demo.yaml appear to be wrong.
I found one error: embed_dim should be 1024 instead of 1280. After correcting it, I still get a few mismatch errors, as follows:
RuntimeError: Error(s) in loading state_dict for SAM:
    size mismatch for image_encoder.blocks.5.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
    size mismatch for image_encoder.blocks.5.attn.rel_pos_w: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
    size mismatch for image_encoder.blocks.7.attn.rel_pos_h: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
    size mismatch for image_encoder.blocks.7.attn.rel_pos_w: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
    size mismatch for image_encoder.blocks.11.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
    size mismatch for image_encoder.blocks.11.attn.rel_pos_w: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
    size mismatch for image_encoder.blocks.15.attn.rel_pos_h: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
    size mismatch for image_encoder.blocks.15.attn.rel_pos_w: copying a param with shape torch.Size([27, 64]) from checkpoint, the shape in current model is torch.Size([127, 64]).
    size mismatch for image_encoder.blocks.17.attn.rel_pos_h: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
    size mismatch for image_encoder.blocks.17.attn.rel_pos_w: copying a param with shape torch.Size([127, 64]) from checkpoint, the shape in current model is torch.Size([27, 64]).
So, could you provide a correct yaml file? Thanks a lot!
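For what it's worth, the remaining mismatches look like a wrong global_attn_indexes setting: the checkpoint has the 127-long relative-position tables at blocks 5, 11, 17 (ViT-L's global indexes [5, 11, 17, 23]), while the model being built places them at 7, 15 (ViT-H's [7, 15, 23, 31]). A minimal sketch of where 127 and 27 come from, assuming SAM's decomposed relative-position tables have length 2 * size - 1 as in the official segment-anything image encoder:

```python
# Sketch of why the rel_pos shapes disagree. In SAM's ViT encoder,
# each attention block keeps rel_pos_h / rel_pos_w tables of length
# 2 * size - 1, where size is the attended spatial extent.

def rel_pos_len(img_size=1024, patch_size=16, window_size=14, is_global=False):
    """Side length of the rel_pos_h / rel_pos_w table for one block."""
    size = img_size // patch_size if is_global else window_size
    return 2 * size - 1

# Global-attention blocks attend over the full 64x64 feature map -> 127.
assert rel_pos_len(is_global=True) == 127
# Windowed blocks attend over 14x14 windows -> 27.
assert rel_pos_len(is_global=False) == 27

# The checkpoint expects 127 at blocks 5, 11, 17 (and 23), matching
# ViT-L's global_attn_indexes, while the built model uses 127 at
# blocks 7, 15 (and 23, 31), matching ViT-H's indexes.
vit_l_global_attn_indexes = [5, 11, 17, 23]
vit_h_global_attn_indexes = [7, 15, 23, 31]
```

So besides embed_dim: 1024, the ViT-L config likely also needs depth 24, 16 heads, and global_attn_indexes [5, 11, 17, 23] instead of the ViT-H values.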