THUDM / ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型
Apache License 2.0

[Help] <size mismatch for embedding.weight> copying a param with shape torch.Size([8, 229376]) from checkpoint, the shape in current model is torch.Size([8, 4096]). #1047

Open MrWuzy1994 opened 1 year ago

MrWuzy1994 commented 1 year ago

Is there an existing issue for this?

Current Behavior

Help please! After fine-tuning the int4 model, running it fails with the error below:

Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /home/luban/chatglm-6b-int4 and are newly initialized: ['transformer.prefix_encoder.trans.0.weight', 'transformer.prefix_encoder.trans.2.weight', 'transformer.prefix_encoder.trans.2.bias', 'transformer.prefix_encoder.embedding.weight', 'transformer.prefix_encoder.trans.0.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/home/luban/ChatGLM-6B/cli_demo.py", line 18, in <module>
    model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
  File "/home/luban/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PrefixEncoder:
  Missing key(s) in state_dict: "trans.0.weight", "trans.0.bias", "trans.2.weight", "trans.2.bias".
  size mismatch for embedding.weight: copying a param with shape torch.Size([8, 229376]) from checkpoint, the shape in current model is torch.Size([8, 4096]).
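For context, cli_demo.py line 18 in the traceback is the prefix-encoder loading step. A minimal sketch of that flow, with a placeholder checkpoint directory and a placeholder pre_seq_len (both must match what was actually used), looks roughly like this:

```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

MODEL_PATH = "/home/luban/chatglm-6b-int4"          # base int4 model from the report
CHECKPOINT_PATH = "output/my-ptuning-checkpoint"    # placeholder for the fine-tuned output dir

# pre_seq_len (and prefix_projection, if used) must match the training-time values
config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True, pre_seq_len=128)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_PATH, config=config, trust_remote_code=True)

# Keep only the prefix-encoder weights from the fine-tuned checkpoint and load them
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"), map_location="cpu")
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
```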

Expected Behavior

No response

Steps To Reproduce

Run the model after fine-tuning it.

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

teanon commented 1 year ago

My understanding is that certain parameters at inference time (such as source_prefix and prefix_projection) have to match the values used during training:

config = AutoConfig.from_pretrained(path, trust_remote_code=True, pre_seq_len=128, source_prefix='add your prefix here if you trained with one', prefix_projection=True)
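For what it's worth, the shapes in the error line up with that explanation. A simplified sketch of the prefix encoder (not the exact modeling_chatglm.py source; num_layers=28 and hidden_size=4096 are the ChatGLM-6B defaults) shows how prefix_projection changes the parameters it expects:

```python
import torch

class PrefixEncoder(torch.nn.Module):
    """Sketch: the parameter layout depends on prefix_projection."""
    def __init__(self, pre_seq_len, num_layers=28, hidden_size=4096, prefix_projection=False):
        super().__init__()
        self.prefix_projection = prefix_projection
        if prefix_projection:
            # embedding.weight: [pre_seq_len, hidden_size], plus an MLP ("trans.0", "trans.2")
            self.embedding = torch.nn.Embedding(pre_seq_len, hidden_size)
            self.trans = torch.nn.Sequential(
                torch.nn.Linear(hidden_size, hidden_size),
                torch.nn.Tanh(),
                torch.nn.Linear(hidden_size, num_layers * 2 * hidden_size),
            )
        else:
            # embedding.weight: [pre_seq_len, num_layers * 2 * hidden_size], no "trans" keys at all
            self.embedding = torch.nn.Embedding(pre_seq_len, num_layers * 2 * hidden_size)

    def forward(self, prefix):
        out = self.embedding(prefix)
        return self.trans(out) if self.prefix_projection else out
```

Read this way, the checkpoint in the traceback (embedding.weight of [8, 229376] and no trans.* keys, where 28 * 2 * 4096 = 229376) looks like it was trained with prefix_projection off and pre_seq_len=8, while the model being built at inference expects [8, 4096] plus trans.*, i.e. prefix_projection=True. Both pre_seq_len and prefix_projection need to agree with training.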

CyanMystery commented 1 year ago

> My understanding is that certain parameters at inference time (such as source_prefix and prefix_projection) have to match the values used during training: config = AutoConfig.from_pretrained(path, trust_remote_code=True, pre_seq_len=128, source_prefix='add your prefix here if you trained with one', prefix_projection=True)

I added them and it still fails. I specified both prefix_projection and the p-tuning checkpoint.

starevelyn commented 1 year ago

Try config = AutoConfig.from_pretrained(model_path, trust_remote_code=True, pre_seq_len=576); the pre_seq_len here has to match the value used when the model was trained.
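If you are not sure which values were used at training time, they can be read back from the saved checkpoint itself, assuming it was produced by the repo's p-tuning script (the path below is a placeholder):

```python
import os
import torch

CHECKPOINT_PATH = "output/my-ptuning-checkpoint"  # placeholder for the fine-tuned output dir

prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"), map_location="cpu")
weight = prefix_state_dict["transformer.prefix_encoder.embedding.weight"]

pre_seq_len, width = weight.shape
print("pre_seq_len at training:", pre_seq_len)          # 8 in the traceback above
print("prefix_projection at training:", width == 4096)  # 229376 there, so False
```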