Error occurred when executing T5TextEncoderLoader #ELLA: Error(s) in loading state_dict for T5EncoderModel: size mismatch for shared.weight: copying a param with shape torch.Size([32128, 2048]) from checkpoint, the shape in current model is torch.Size([32128, 512]). size mismatch for encoder.block.0.layer.0.SelfAttention.q.weight: copying a param with shape torch.Size([2048, 2048]) from checkpoint, the shape in current model is torch.Size([512, 512]). #48
Error occurred when executing T5TextEncoderLoader #ELLA:
Error(s) in loading state_dict for T5EncoderModel:
size mismatch for shared.weight: copying a param with shape torch.Size([32128, 2048]) from checkpoint, the shape in current model is torch.Size([32128, 512]).
size mismatch for encoder.block.0.layer.0.SelfAttention.q.weight: copying a param with shape torch.Size([2048, 2048]) from checkpoint, the shape in current model is torch.Size([512, 512]).
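
To make the mismatch easier to read, here is a minimal sketch that parses the message above into (parameter, checkpoint shape, model shape) rows. The names `PATTERN` and `parse_mismatches` are hypothetical helpers written for this illustration, not part of ELLA, ComfyUI, or transformers. In T5, the second dimension of `shared.weight` is the hidden size (`d_model`), so the rows show a checkpoint with hidden size 2048 being loaded into a model built with hidden size 512. One possible cause, stated as an assumption, is that the model was constructed from a config that does not match the downloaded weights.

```python
import re

# Hypothetical helper: matches one "size mismatch for ..." entry from the
# PyTorch load_state_dict error text quoted in this issue.
PATTERN = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, the shape in "
    r"current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def parse_mismatches(message):
    """Return a list of (param_name, checkpoint_shape, model_shape) tuples."""
    rows = []
    for m in PATTERN.finditer(message):
        ckpt = tuple(int(x) for x in m.group("ckpt").split(","))
        model = tuple(int(x) for x in m.group("model").split(","))
        rows.append((m.group("name"), ckpt, model))
    return rows


# The two entries reported in this issue, copied from the error text.
message = (
    "size mismatch for shared.weight: copying a param with shape "
    "torch.Size([32128, 2048]) from checkpoint, the shape in current model is "
    "torch.Size([32128, 512]). "
    "size mismatch for encoder.block.0.layer.0.SelfAttention.q.weight: "
    "copying a param with shape torch.Size([2048, 2048]) from checkpoint, "
    "the shape in current model is torch.Size([512, 512])."
)

rows = parse_mismatches(message)
for name, ckpt, model in rows:
    # The last dimension is the hidden size in both reported parameters.
    print(f"{name}: checkpoint hidden size {ckpt[-1]}, model hidden size {model[-1]}")
```

If the hidden sizes differ for every listed parameter, as they do here, the weights file itself is probably intact and the thing to check is which configuration the model was built from.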