juncongmoo / pyllama

LLaMA: Open and Efficient Foundation Language Models
GNU General Public License v3.0

RuntimeError: Error(s) in loading state_dict for LLaMAForCausalLM: Unexpected key(s) in state_dict: #102

Open · ZealHua opened this issue 1 year ago

ZealHua commented 1 year ago

When I run

```
python ./pyllama/quant_infer.py --wbits 4 --load ./output/654b.pt --text "what is life" --max_length 24 --cuda cuda:0
```

I get the error below. I quantized LLaMA 65B to int4, which produced the file 654b.pt, and I followed the instructions here. Can anyone help?

```
RuntimeError: Error(s) in loading state_dict for LLaMAForCausalLM:
Unexpected key(s) in state_dict:
"model.layers.32.self_attn.q_proj.zeros", "model.layers.32.self_attn.q_proj.scales", "model.layers.32.self_attn.q_proj.bias", "model.layers.32.self_attn.q_proj.qweight",
"model.layers.32.self_attn.k_proj.zeros", "model.layers.32.self_attn.k_proj.scales", "model.layers.32.self_attn.k_proj.bias", "model.layers.32.self_attn.k_proj.qweight",
"model.layers.32.self_attn.v_proj.zeros", "model.layers.32.self_attn.v_proj.scales", "model.layers.32.self_attn.v_proj.bias", "model.layers.32.self_attn.v_proj.qweight",
"model.layers.32.self_attn.o_proj.zeros", "model.layers.32.self_attn.o_proj.scales", "model.layers.32.self_attn.o_proj.bias", "model.layers.32.self_attn.o_proj.qweight",
"model.layers.32.self_attn.rotary_emb.inv_freq",
"model.layers.32.mlp.gate_proj.zeros", "model.layers.32.mlp.gate_proj.scales", "model.layers.32.mlp.gate_proj.bias", "model.layers.32.mlp.gate_proj.qweight",
"model.layers.32.mlp.down_proj.zeros", "model.layers.32.mlp.down_proj.scales", "model.layers.32.mlp.down_proj.bias", "model.layers.32.mlp.down_proj.qweight",
"model.layers.32.mlp.up_proj.zeros", "model.layers.32.mlp.up_proj.scales", "model.layers.32.mlp.up_proj.bias", "model.layers.32.mlp.up_proj.qweight",
"model.layers.32.input_layernorm.weight", "model.layers.32.post_attention_layernorm.weight",
"model.layers.33.self_attn.q_proj.zeros", "model.layers.33.self_attn.q_proj.scales", ...
...
size mismatch for model.layers.31.mlp.up_proj.scales: copying a param with shape torch.Size([22016, 1]) from checkpoint, the shape in current model is torch.Size([11008, 1]).
size mismatch for model.layers.31.mlp.up_proj.bias: copying a param with shape torch.Size([22016]) from checkpoint, the shape in current model is torch.Size([11008]).
size mismatch for model.layers.31.mlp.up_proj.qweight: copying a param with shape torch.Size([1024, 22016]) from checkpoint, the shape in current model is torch.Size([512, 11008]).
size mismatch for model.layers.31.input_layernorm.weight: copying a param with shape torch.Size([8192]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.layers.31.post_attention_layernorm.weight: copying a param with shape torch.Size([8192]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for model.norm.weight: copying a param with shape torch.Size([8192]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([32000, 8192]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
```
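From the shapes in the traceback, the checkpoint looks like a 65B quantization (keys for layers 32 and beyond, hidden size 8192, intermediate size 22016), while the model being instantiated for inference has the 7B geometry (32 layers, 4096, 11008), which is why `load_state_dict` rejects it. A minimal sketch for confirming which configuration the checkpoint was quantized from, assuming `654b.pt` is an ordinary torch-saved state dict (the optional `"model"` unwrapping is an assumption for nested checkpoints):

```python
import torch

# Load the quantized checkpoint on CPU and inspect its keys/shapes.
ckpt = torch.load("./output/654b.pt", map_location="cpu")

# Assumption: some export scripts nest the weights under a "model" key;
# otherwise the loaded object is the state dict itself.
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt

# Count the transformer layers referenced by the keys (65B has 80, 7B has 32).
layer_ids = {int(k.split(".")[2]) for k in state_dict if k.startswith("model.layers.")}
print("layers in checkpoint:", max(layer_ids) + 1)

# Hidden size can be read off the final norm weight (8192 for 65B, 4096 for 7B).
print("hidden size:", state_dict["model.norm.weight"].shape[0])
```

If the checkpoint really is 65B, the inference script needs to build the matching 65B model before loading it; how the model size is selected depends on the pyllama version, so check quant_infer.py's arguments rather than relying on its default.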