Open Sahajtomar opened 1 year ago
I got the same error, any workarounds?
RuntimeError: Error(s) in loading state_dict for T5ForConditionalGeneration:
    size mismatch for shared.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
    size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
    size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
    size mismatch for lm_head.weight: copying a param with shape torch.Size([32100, 768]) from checkpoint, the shape in current model is torch.Size([32128, 768]).
I am trying to load the t5-base model as per the t5_ppo config, and strangely this error comes up. The same setup works fine for t5-small.
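To narrow this down, here is a rough sketch that compares parameter shapes between a checkpoint and a model and reports how many embedding rows differ. The helper name `find_size_mismatches` is made up for illustration, and the shapes are copied by hand from the error message above rather than read from real tensors, so treat it as a diagnostic aid and not a fix.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_size_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for each parameter
    present in both mappings whose shapes differ.

    Hypothetical helper, not part of any library.
    """
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Parameter names and shapes taken from the error message in this comment.
names = [
    "shared.weight",
    "encoder.embed_tokens.weight",
    "decoder.embed_tokens.weight",
    "lm_head.weight",
]
checkpoint_shapes = {n: (32100, 768) for n in names}
model_shapes = {n: (32128, 768) for n in names}

mismatches = find_size_mismatches(checkpoint_shapes, model_shapes)
for name, ckpt, model in mismatches:
    # The first dimension is the vocabulary size; the second is the hidden size.
    print(f"{name}: checkpoint {ckpt} vs model {model}, "
          f"row difference {model[0] - ckpt[0]}")
```

With real objects, the two mappings could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}` from the checkpoint and from `model.state_dict()`. All four mismatches are in the vocabulary dimension (32100 vs 32128), so one thing to try, which I have not verified against this setup, is calling `model.resize_token_embeddings(32100)` on the freshly built model before loading the checkpoint.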