size mismatch for model

H-0906 commented 1 month ago

After the model finished downloading, running it produces the following error: RuntimeError: Error(s) in loading state_dict for UniMERModel: size mismatch for model.model.decoder.model.decoder.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.0.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.0.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.0.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.0.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.0.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.0.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.1.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
size mismatch for model.model.decoder.model.decoder.layers.1.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.1.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.1.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.1.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.1.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.1.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.1.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.2.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.2.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). 
size mismatch for model.model.decoder.model.decoder.layers.2.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.2.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.2.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.2.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.2.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.2.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.3.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.3.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.3.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). 
size mismatch for model.model.decoder.model.decoder.layers.3.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.3.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.3.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.3.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.3.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.4.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.4.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.4.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.4.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). 
size mismatch for model.model.decoder.model.decoder.layers.4.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.4.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.4.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.4.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.5.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.5.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.5.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.5.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.5.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). 
size mismatch for model.model.decoder.model.decoder.layers.5.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.5.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.5.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.6.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.6.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.6.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.6.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.6.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.6.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). 
size mismatch for model.model.decoder.model.decoder.layers.6.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.6.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.7.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.7.self_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.7.self_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.7.self_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.7.encoder_attn.k_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). size mismatch for model.model.decoder.model.decoder.layers.7.encoder_attn.k_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for model.model.decoder.model.decoder.layers.7.encoder_attn.q_proj.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 1024]). 
size mismatch for model.model.decoder.model.decoder.layers.7.encoder_attn.q_proj.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). How can this problem be solved?
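To see at a glance which parameters disagree, the shapes stored in the checkpoint can be compared with the shapes the current model expects. The following is only a minimal sketch: `find_shape_mismatches` is a hypothetical helper, not part of unimernet or PyTorch, and the third entry in the sample data is made up for illustration. The first two entries are copied from the error above.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Names missing from the model are "unexpected keys", not size mismatches, so skip them.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# With PyTorch, the inputs could be built roughly like this (assumes the checkpoint
# file holds a flat state dict; not executed here):
#   checkpoint_shapes = {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

prefix = "model.model.decoder.model.decoder.layers.0.self_attn."
checkpoint_shapes = {
    prefix + "k_proj.weight": (1024, 1024),  # from the error message
    prefix + "k_proj.bias": (1024,),         # from the error message
    prefix + "example.weight": (1024, 1024), # hypothetical entry that matches
}
model_shapes = {
    prefix + "k_proj.weight": (512, 1024),
    prefix + "k_proj.bias": (512,),
    prefix + "example.weight": (1024, 1024),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt_shape, model_shape) in mismatches.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

In the log above, every reported entry is a `q_proj` or `k_proj` weight or bias in `self_attn` or `encoder_attn`, for decoder layers 0 through 7.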

lllfx commented 1 month ago

I encountered the same problem

r4v3nz commented 1 month ago

same here (using colab)

jcui2001 commented 1 month ago

I have the same problem. I think I can add arguments for num_attention_heads and head_size, which should be multiplied to change the param shape?

per this post: https://github.com/Lightning-AI/litgpt/issues/1289
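For reference, the arithmetic behind that idea can be sketched as follows. In a standard attention layer the projection weight has shape (out_features, in_features), and out_features is the number of heads times the per-head size. The head counts and sizes below are assumptions chosen only to reproduce the two shapes in the error; the thread does not state the real values, and whether unimernet exposes such arguments is not established here.

```python
def proj_weight_shape(num_attention_heads, head_size, embed_dim):
    """Shape of a q/k projection weight: (heads * head_size, embed_dim)."""
    return (num_attention_heads * head_size, embed_dim)


# Hypothetical values: 16 heads of size 64 give the checkpoint's shape,
# while 16 heads of size 32 give the shape the current model expects.
checkpoint_like = proj_weight_shape(16, 64, 1024)
current_model_like = proj_weight_shape(16, 32, 1024)

print(checkpoint_like)
print(current_model_like)
```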

hjbiao09 commented 1 month ago

A large amount of code changed in the latest unimernet version (0.2.0, updated on 9/6). The best way seems to be to change the model config directly (1024 --> 512), but there are so many models that it's hard to understand.

So, you can simply pip install unimernet==0.1.6 to downgrade to the previous version.
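Before downgrading, it can help to confirm which version is actually installed. This is a minimal sketch using the standard library's `importlib.metadata`; the helper names are hypothetical, the version comparison handles only plain numeric versions such as 0.1.6 or 0.2.0, and treating 0.1.6 as the known-good version is based solely on this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="unimernet"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def parse_version(text):
    # Simplified parsing: assumes purely numeric components like "0.1.6".
    return tuple(int(part) for part in text.split(".")[:3])


def is_newer_than_known_good(installed, known_good="0.1.6"):
    """True if the installed version is newer than the version reported to work."""
    return parse_version(installed) > parse_version(known_good)


current = installed_version()
if current is None:
    print("unimernet is not installed")
elif is_newer_than_known_good(current):
    print(f"unimernet {current} is newer than 0.1.6; consider: pip install unimernet==0.1.6")
else:
    print(f"unimernet {current} is at or below 0.1.6")
```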

H-0906 commented 1 month ago

OK! I have installed unimernet 0.1.6 and this problem has been handled. Thank you!

opendatalab / PDF-Extract-Kit

size mismatch for model #117