gshuangchun closed this 4 months ago
Same question here.
the same question
I solved it: add --use_ln, because these weights were trained with layernorm.
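For anyone hitting the same error, here is a minimal sketch of why the flag matters. It is not the repository's code: the `Block` class and `try_load` helper are hypothetical, and it assumes the `ln` submodule is a per-channel batch norm by default and a layer norm over (100, 8, 14) when layernorm is enabled, which is what the shapes in the error message suggest.

```python
import torch
from torch import nn


class Block(nn.Module):
    """Hypothetical stand-in for one decoder layer; not the repository's code."""

    def __init__(self, use_ln: bool):
        super().__init__()
        # Assumption: layer norm spans (channels, height, width) = (100, 8, 14);
        # otherwise a per-channel batch norm with 100 features is used.
        self.ln = nn.LayerNorm([100, 8, 14]) if use_ln else nn.BatchNorm2d(100)


# Plays the role of the *_layernorm_* checkpoint: its ln.weight is (100, 8, 14).
checkpoint = Block(use_ln=True).state_dict()


def try_load(use_ln: bool) -> str:
    """Return 'ok' if the checkpoint loads, else the error text PyTorch raises."""
    model = Block(use_ln=use_ln)
    try:
        model.load_state_dict(checkpoint)
        return "ok"
    except RuntimeError as err:
        return str(err)


with_flag = try_load(use_ln=True)      # model built with layer norm
without_flag = try_load(use_ln=False)  # model built without it

print(with_flag)
print(without_flag)
```

Building the model without layer norm gives `ln.weight` a shape of `[100]`, so loading a layernorm checkpoint fails with the same kind of "size mismatch" message as in this issue.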
Dear user,
The README.md file explains how to match the right weights to each model. The weight filenames describe their respective configurations.
Thanks
The link to the low-res weights for PyTorch CholecT45 (crossval k1) does not seem to match the model (https://s3.unistra.fr/camma_public/github/rendezvous/rendezvous_l8_cholect45_crossval_k1_layernorm_lowres.pth):
size mismatch for decoder.mhma.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). 
size mismatch for decoder.mhma.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.mhma.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.0.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.0.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.1.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.1.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.2.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.2.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.3.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.3.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). 
size mismatch for decoder.ffnet.4.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.4.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.5.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.5.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.6.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.6.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.7.ln.weight: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]). size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape torch.Size([100, 8, 14]) from checkpoint, the shape in current model is torch.Size([100]).
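Every mismatch in this log is on an `ln` parameter, which points at the normalization setting rather than at a corrupted or wrong file. A rough sketch for summarizing such a log is below. It assumes the error text follows the format pasted above; `summarize_mismatches` is a hypothetical helper, not something the repository provides.

```python
import re

# One "size mismatch" entry, in the format shown in the log above.
PATTERN = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def _shape(text):
    """Turn '100, 8, 14' into (100, 8, 14)."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def summarize_mismatches(log):
    """Return (parameter name, checkpoint shape, model shape) for each entry."""
    return [
        (m["name"], _shape(m["ckpt"]), _shape(m["model"]))
        for m in PATTERN.finditer(log)
    ]


# Two entries copied from the log in this issue.
sample = (
    "size mismatch for decoder.mhma.0.ln.weight: copying a param with shape "
    "torch.Size([100, 8, 14]) from checkpoint, the shape in current model is "
    "torch.Size([100]). "
    "size mismatch for decoder.ffnet.7.ln.bias: copying a param with shape "
    "torch.Size([100, 8, 14]) from checkpoint, the shape in current model is "
    "torch.Size([100])."
)

rows = summarize_mismatches(sample)
for name, ckpt_shape, model_shape in rows:
    print(name, ckpt_shape, model_shape)

# True when every mismatched parameter belongs to an `ln` submodule.
only_norm_layers = all(".ln." in name for name, _, _ in rows)
print(only_norm_layers)
```

If `only_norm_layers` is true and the checkpoint shapes are 3-D while the model shapes are 1-D, the model was most likely built without layernorm while the weights expect it.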