Torch Size Mismatch

Aleph-Alpha / magma

MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website https://app.aleph-alpha.com

MIT License


Hey guys!

I ran into an issue while loading Magma from the checkpoint, and I was wondering whether anyone has encountered it or knows how to solve it.

RuntimeError: Error(s) in loading state_dict for Magma: size mismatch for lm.lm_head.weight: copying a param with shape torch.Size([50400, 4096]) from checkpoint, the shape in current model is torch.Size([50258, 4096]).

It looks like the checkpoint and the model built by the code disagree on the shape of lm.lm_head.weight: the checkpoint stores 50400 rows, while the current model expects 50258.
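To check whether lm.lm_head.weight is the only affected parameter, it can help to compare parameter shapes before calling load_state_dict. The following is a minimal sketch, not part of the magma repo: the helper name find_shape_mismatches is hypothetical, and it works on plain name-to-shape dictionaries so it runs without a checkpoint on disk. With PyTorch, such dictionaries could be built with something like {k: tuple(v.shape) for k, v in state_dict.items()}, assuming the checkpoint file holds a flat state dict (it may instead be nested under a key).

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both mappings whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        # Parameters missing from the model are skipped; only shape
        # disagreements on shared names are reported.
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the error message above.
ckpt_shapes = {"lm.lm_head.weight": (50400, 4096)}
model_shapes = {"lm.lm_head.weight": (50258, 4096)}

for name, ckpt_shape, model_shape in find_shape_mismatches(ckpt_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If only the first dimension differs, as it does here, the disagreement is in the vocabulary size the language model head was built with, which narrows down where in the model construction code to look.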

Thank you so much--this model looks super cool and I'm excited to use it!


Torch Size Mismatch #34