Aleph-Alpha / magma

MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website https://app.aleph-alpha.com
MIT License

Torch Size Mismatch #34

Closed by harshagundala 2 years ago

harshagundala commented 2 years ago

Hey guys!

I ran into an issue while loading Magma from the checkpoint, and I was wondering whether anyone has encountered it or knows how to solve it.

RuntimeError: Error(s) in loading state_dict for Magma:
    size mismatch for lm.lm_head.weight: copying a param with shape torch.Size([50400, 4096]) from checkpoint, the shape in current model is torch.Size([50258, 4096]).

It seems the checkpoint stores lm.lm_head.weight with 50400 rows (the vocabulary dimension), while the model built by the rest of the code expects 50258.
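For anyone debugging something similar, here is a minimal sketch of how the mismatching parameters could be listed before calling `load_state_dict`. The helper name `find_shape_mismatches` is hypothetical (it is not part of the magma repo), and the two dictionaries are filled in by hand with the shapes from the error message above rather than read from a real checkpoint.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both mappings whose shapes differ.

    Hypothetical helper for illustration; not part of magma or PyTorch.
    """
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied by hand from the error message above.
checkpoint_shapes = {"lm.lm_head.weight": (50400, 4096)}
model_shapes = {"lm.lm_head.weight": (50258, 4096)}

for name, ckpt_shape, model_shape in find_shape_mismatches(
    checkpoint_shapes, model_shapes
):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With PyTorch, the two mappings could be built from real objects with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`, once for the loaded checkpoint and once for `model.state_dict()`.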

Thank you so much--this model looks super cool and I'm excited to use it!

CoEich commented 2 years ago

Hi,

This is likely because you are using the wrong transformers version. See https://github.com/Aleph-Alpha/magma/issues/27
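A quick way to see which version is actually installed in the active environment is sketched below. It only reports the installed version; which version is the right one is not stated here, so compare the output against whatever the linked issue and the repository's requirements specify. The helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if the
    package is not installed in the current environment.

    Hypothetical helper for illustration only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the transformers version visible to this Python interpreter.
print("transformers:", installed_version("transformers"))
```

If the printed version differs from the expected one, reinstalling the expected version in the same environment and reloading the checkpoint would be the next thing to try.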

Best,

Constantin