I followed the README fine-tuning procedure exactly. After preprocessing the data, I ran the script to start fine-tuning: bash CeMAT_plugins/task_NMT_cemat.sh
Error message:
RuntimeError: Error(s) in loading state_dict for BiTransformerModel:
size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([64905, 1024]) from checkpoint, the shape in current model is torch.Size([250035, 1024]).
size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([64905, 1024]) from checkpoint, the shape in current model is torch.Size([250035, 1024]).
size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([64905, 1024]) from checkpoint, the shape in current model is torch.Size([250035, 1024]).
How can I resolve this?
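To help triage, the shapes in the log above can be pulled out and compared side by side. This is a minimal sketch using only the Python standard library. The helper name `parse_mismatches` is hypothetical and not part of CeMAT or fairseq, and the regular expression assumes the message format shown in this report.

```python
import re

# Matches one "size mismatch" entry in the format shown in the log above.
MISMATCH_RE = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, the shape in "
    r"current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def parse_mismatches(log_text):
    """Return (param_name, checkpoint_shape, model_shape) for each mismatch."""
    rows = []
    for m in MISMATCH_RE.finditer(log_text):
        ckpt = tuple(int(x) for x in m.group("ckpt").split(","))
        model = tuple(int(x) for x in m.group("model").split(","))
        rows.append((m.group("name"), ckpt, model))
    return rows


# The three mismatch lines copied from the error message in this report.
log = """
size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([64905, 1024]) from checkpoint, the shape in current model is torch.Size([250035, 1024]).
size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([64905, 1024]) from checkpoint, the shape in current model is torch.Size([250035, 1024]).
size mismatch for decoder.output_projection.weight: copying a param with shape torch.Size([64905, 1024]) from checkpoint, the shape in current model is torch.Size([250035, 1024]).
"""

for name, ckpt, model in parse_mismatches(log):
    # The first dimension of an embedding weight is the number of vocabulary entries.
    print(f"{name}: checkpoint rows={ckpt[0]}, model rows={model[0]}")
```

All three parameters differ only in their first dimension, which for an embedding or output projection weight is the vocabulary size. The checkpoint was therefore saved with 64905 vocabulary entries, while the model built at fine-tuning time has 250035. One plausible reading, which is an assumption and not confirmed here, is that the dictionary loaded from the data directory is not the one the checkpoint was trained with.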