How to load the new model weight

TencentARC / LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

Apache License 2.0

482 stars 35 forks source link

I am trying to extend Mistral 7B, I've try the code of extending the model. It's work fine but when I am trying to load the model with the new data, I got an error. The model doesn't know the new block.

when I've try model.load_state_dict(output) where output is the new state_dict from the block expansion script, I got this error.

RuntimeError: Error(s) in loading state_dict for MistralForCausalLM: Unexpected key(s) in state_dict: "model.layers.32.self_attn.q_proj.weight", "model.layers.32.self_attn.k_proj.weight",etc...

can you tell please how did you solve this or you didn't face it when working with llama

TencentARC / LLaMA-Pro

How to load the new model weight #8