Weights not matched

gshuangchun commented 12 months ago

The linked low-res weights for the PyTorch CholecT45 (cross-val k1) model do not seem to match the model (https://s3.unistra.fr/camma_public/github/rendezvous/rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth). Loading them fails with:

size mismatch for decoder.mhma.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). 
size mismatch for decoder.mhma.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). 
size mismatch for decoder.ffnet.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).

Esther-qian commented 10 months ago

Same question.

sabinakaminska95 commented 4 months ago

I have the same question.

sabinakaminska95 commented 4 months ago

I solved it: add --use_ln when running the model, because these weights use layernorm (note "layernorm" in the weight filename).
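For anyone hitting a similar error, here is a minimal sketch of how the mismatched parameters could be listed before loading. It is plain Python over name-to-shape dictionaries; the function name find_shape_mismatches is hypothetical, and the two example shapes are copied from the error log above. In practice the shapes would come from the checkpoint and from the model's state_dict(), which is not shown here.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for parameters
    present in both dictionaries whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes taken from the error log in this issue (first layer only).
checkpoint_shapes = {
    "decoder.mhma.0.ln.weight": (100, 8, 14),
    "decoder.mhma.0.ln.bias": (100, 8, 14),
}
# Shapes the model reported when built without --use_ln.
model_shapes = {
    "decoder.mhma.0.ln.weight": (100,),
    "decoder.mhma.0.ln.bias": (100,),
}

for name, ckpt_shape, model_shape in find_shape_mismatches(
    checkpoint_shapes, model_shapes
):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If every mismatch sits on a normalization parameter, as here, the model was probably built with a different normalization setting than the one the checkpoint was trained with.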

nwoyecid commented 4 months ago

Dear user,

Information about matching the right weights and models is provided in the README.md file. The weight filenames are descriptive about their respective configs.
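As a rough illustration of reading the config from a weight filename, the sketch below splits the filename from this issue into its underscore-separated parts and checks for "layernorm". The helper names are hypothetical, and the link between the "layernorm" part and --use_ln is taken from the comment above, not from the README; the meaning of the other parts is not assumed here.

```python
from typing import List


def filename_tokens(url_or_path: str) -> List[str]:
    """Split a weight filename (without directory or extension) on underscores."""
    name = url_or_path.rsplit("/", 1)[-1]   # drop directories / URL prefix
    stem = name.rsplit(".", 1)[0]           # drop the .pth extension
    return stem.split("_")


def suggests_layernorm(url_or_path: str) -> bool:
    """True when the filename contains a 'layernorm' token."""
    return "layernorm" in filename_tokens(url_or_path)


weights_url = (
    "https://s3.unistra.fr/camma_public/github/rendezvous/"
    "rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth"
)

print(filename_tokens(weights_url))
if suggests_layernorm(weights_url):
    # Per the earlier comment in this thread, such weights need --use_ln.
    print("Filename mentions layernorm: run with --use_ln")
```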

Thanks

CAMMA-public / rendezvous

Weights not matched #21