DLLXW / baby-llama2-chinese

A repo for pretraining a small-parameter Chinese LLaMa2 from scratch and then applying SFT; a single 24 GB GPU is enough to produce a chat-llama2 with basic Chinese Q&A ability.
MIT License

Without SFT, inference throws an error; please take a look #30

Open hopeforus opened 10 months ago

hopeforus commented 10 months ago

Traceback (most recent call last):
  File "/home/hope/work/baby-llama2-chinese/eval_hope.py", line 67, in <module>
    model.load_state_dict(state_dict, strict=False)
  File "/home/hope/miniconda3/envs/llama2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Transformer:
    size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([64793, 1024]) from checkpoint, the shape in current model is torch.Size([64793, 512]).

DLLXW commented 10 months ago

The dim of 512 set in your code doesn't match the dim of 1024 in the saved model; just change dim in the code to 1024.
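
A minimal sketch of that fix, assuming the repo's llama2.c-style model.py with a `ModelArgs` dataclass and the `Transformer` class named in the traceback; the checkpoint path here is illustrative, not the repo's actual file name:

```python
import torch
from model import ModelArgs, Transformer  # assumption: llama2.c-style API

ckpt_path = "out/epoch_0.pth"  # hypothetical path; point this at your checkpoint
state_dict = torch.load(ckpt_path, map_location="cpu")

# Rather than hard-coding dim=512, read the embedding shape from the
# checkpoint so the model config always matches what was trained:
vocab_size, dim = state_dict["tok_embeddings.weight"].shape  # -> (64793, 1024)

# n_layers, n_heads, etc. must also match the training run's config.
args = ModelArgs(dim=dim, vocab_size=vocab_size)
model = Transformer(args)
model.load_state_dict(state_dict, strict=False)
model.eval()
```

Note that `strict=False` only tolerates missing or unexpected keys; PyTorch still raises on shape mismatches, which is why the error appeared despite that flag.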

hopeforus commented 10 months ago

OK, I'll give it a try. Thanks a lot!