zjunlp / EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
https://zjunlp.github.io/project/KnowEdit
MIT License

How to load SERAC weights #261

Closed Wonder1905 closed 3 months ago

Wonder1905 commented 3 months ago

Hi, I'm using the checkpoint found at https://drive.google.com/drive/folders/1rVWN7z05CHkIwbOJ5Cy3E0seWSpU6SeT, and I want to load it so that I'll be able to evaluate the model. My hparams file is:

```yaml
# Model
alg_name: "SERAC"
archive: ./results/models/SERAC/llama-2-7b.bk
device: 0
model_name: meta-llama/Llama-2-7b-hf
model_class: LlamaForCausalLM
small_name: Cheng98/llama-160m
tokenizer_class: LlamaTokenizer
tokenizer_name: meta-llama/Llama-2-7b-hf
cls_name: distilbert/distilbert-base-cased
cls_class: AutoModel
inner_params: []
model_parallel: false

# Method
alg: SERAC
lr: 1e-5
edit_lr: 1e-2
seed: 0
lr_lr: 0.0
cedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
final_eval: True
supervised: false
train_base: False
no_grad_layers: null
soft_weighting: false
checkpoint_grad: false
cross_attend: false
cos: false
freeze: null
square: true
bound_embeds: false
use_all_negatives: false
freeze_cntr: false
dist_heads: 1
lora: null

batch_size: 1
model_save_pt: 500
edit_bs: 1
silent: False

max_epochs: 1
max_iters: 10000
log_interval: 500
val_interval: 500
early_stop_patience: 40000
early_stop_key: "loss/total_edit_val"
eval_only: False
half: False
save: False
debug: False
log_errors: False
unlikelihood: True

val_batch_size: 1
accumulate_bs: 10
val_steps: 500
opt: Adam
grad_clip: 100.

# Output
results_dir: ./results
```

And I'm getting an error when loading, "Shouldn't have any unexpected keys", where the unexpected keys are:

```
['replacement.model.layers.0.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.1.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.2.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.3.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.4.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.5.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.6.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.7.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.8.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.9.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.10.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.11.self_attn.rotary_emb.inv_freq']
```
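For reference, this is roughly how I load the checkpoint for evaluation (a minimal sketch of the standard EasyEdit entry point; the hparams path and the toy edit are placeholders, and the `SERACHparams` class name may differ between easyeditor versions):

```python
from easyeditor import BaseEditor, SERACHparams  # class name as in recent easyeditor releases

# placeholder path pointing at the YAML shown above
hparams = SERACHparams.from_hparams('./hparams/SERAC/llama-2-7b')
editor = BaseEditor.from_hparams(hparams)

# a single toy edit, just to exercise checkpoint loading and evaluation
metrics, edited_model, _ = editor.edit(
    prompts=['Who is the architect of the Eiffel Tower?'],
    ground_truth=['Gustave Eiffel'],
    target_new=['Stephen Sauvestre'],
)
print(metrics)
```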

Any idea what I am missing?

XeeKee commented 3 months ago

It appears that your model is not in the Hugging Face format for the LLaMA model: the weight names do not match the LlamaForCausalLM class. You need the official LLaMA weights converted to the Hugging Face format. I suggest you update the transformers package and re-download the model weights.

Here are the correct names for the Hugging Face format of the LLaMA model weights:

```
model.layers.29.self_attn.q_proj.weight
model.layers.29.self_attn.k_proj.weight
model.layers.29.self_attn.v_proj.weight
model.layers.29.self_attn.o_proj.weight
model.layers.29.self_attn.rotary_emb.inv_freq
model.layers.29.mlp.gate_proj.weight
model.layers.29.mlp.down_proj.weight
model.layers.29.mlp.up_proj.weight
model.layers.29.input_layernorm.weight
model.layers.29.post_attention_layernorm.weight
model.layers.30.self_attn.q_proj.weight
model.layers.30.self_attn.k_proj.weight
model.layers.30.self_attn.v_proj.weight
model.layers.30.self_attn.o_proj.weight
model.layers.30.self_attn.rotary_emb.inv_freq
model.layers.30.mlp.gate_proj.weight
model.layers.30.mlp.down_proj.weight
model.layers.30.mlp.up_proj.weight
model.layers.30.input_layernorm.weight
model.layers.30.post_attention_layernorm.weight
model.layers.31.self_attn.q_proj.weight
model.layers.31.self_attn.k_proj.weight
model.layers.31.self_attn.v_proj.weight
model.layers.31.self_attn.o_proj.weight
model.layers.31.self_attn.rotary_emb.inv_freq
model.layers.31.mlp.gate_proj.weight
model.layers.31.mlp.down_proj.weight
```
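As a quick check on your side, you can load the base model with transformers and print its state-dict keys to compare against the names above (a minimal sketch; the layer filter is just for brevity):

```python
from transformers import AutoModelForCausalLM

# load the base model named in your hparams and list the keys of one layer
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
for name in model.state_dict():
    if name.startswith("model.layers.29."):
        # note: recent transformers releases register rotary_emb.inv_freq as a
        # non-persistent buffer, so it may not appear in this list at all
        print(name)
```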

zxlzr commented 3 months ago

Hi, do you have any further questions?

Wonder1905 commented 3 months ago

I'm trying to understand, what can be more official than "meta-llama/Llama-2-7b-hf"?

But even if I do get this official version, I don't understand where those weights:

```
['replacement.model.layers.0.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.1.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.2.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.3.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.4.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.5.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.6.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.7.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.8.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.9.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.10.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.11.self_attn.rotary_emb.inv_freq']
```

are supposed to fit in. Will they always be unexpected, or is there something in the code that adds those objects to the model?

XeeKee commented 3 months ago

Hi,

I'm sorry for not expressing myself clearly earlier. I tried to reproduce your issue but couldn't get the same results. Then I checked the names of my LLaMA 2 model weights and found that they are named like `model.layers.29.self_attn.q_proj.weight`. So I suspect that your model might be an earlier version of LLaMA 2 that hasn't been converted to the Hugging Face format, or that your transformers package is outdated. A similar problem has been reported in a related issue.

You can try updating the transformers package first. If this doesn't resolve your issue, please feel free to contact us.
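As a quick sanity check (just a sketch, using the small counterfactual model from your hparams so it loads quickly), you can verify your transformers version and see whether it still serializes the rotary `inv_freq` buffers:

```python
import transformers
from transformers import AutoModelForCausalLM

print(transformers.__version__)

# counterfactual model from the hparams above; small enough to load quickly
model = AutoModelForCausalLM.from_pretrained("Cheng98/llama-160m")

# recent transformers releases register rotary_emb.inv_freq as a non-persistent
# buffer, so this prints an empty list there; older releases still include it
print([k for k in model.state_dict() if "inv_freq" in k])
```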

Thank you!

tbozhong commented 3 months ago

Sorry for any inconvenience.

It appears that you are using a different version of llama-160m from us. Please try JackFram/llama-160m to align with our setting.
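Concretely, that means pointing the counterfactual-model entry of the hparams file above at the JackFram checkpoint:

```yaml
# excerpt of the SERAC hparams file
small_name: JackFram/llama-160m   # instead of Cheng98/llama-160m
```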

zxlzr commented 3 months ago

Hi, have you solved your issue yet?

BUGLI27 commented 2 days ago

> Sorry for any inconvenience.
>
> It appears that you are using a different version of llama-160m from us. Please try JackFram/llama-160m to align with our setting.

Hi, I tried JackFram/llama-160m, but the error still persists when running the SERAC method. The error message is as follows:

```
metrics, edited_model, _ = editor.edit(
  File "/home/hli/project/KnowledgeEdit/EasyEdit/examples/../easyeditor/editors/editor.py", line 271, in edit
    edited_model, weights_copy = self.apply_algo(
  File "/home/hli/project/KnowledgeEdit/EasyEdit/examples/../easyeditor/models/serac/serac_main.py", line 68, in apply_to_model
    self.init_model(model, tok, hparams)
  File "/home/hli/project/KnowledgeEdit/EasyEdit/examples/../easyeditor/models/serac/serac_main.py", line 34, in init_model
    self.alg.load_state_dict(d["model"], False)
  File "/home/hli/project/KnowledgeEdit/EasyEdit/examples/../easyeditor/trainer/algs/SERAC.py", line 123, in load_state_dict
    assert len(res.unexpected_keys) == 0, print(res.unexpected_keys)
AssertionError: None
```

And the output of `res.unexpected_keys` is:

```
['replacement.model.layers.0.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.1.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.2.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.3.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.4.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.5.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.6.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.7.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.8.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.9.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.10.self_attn.rotary_emb.inv_freq',
 'replacement.model.layers.11.self_attn.rotary_emb.inv_freq']
```

I checked the code and found that this error is raised by

```python
d = torch.load(params.archive, map_location='cpu')
self.alg.load_state_dict(d["model"], False)
```

where `params.archive` was set to the path of the JackFram/llama-160m checkpoint.
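As a possible workaround (just a sketch, I am not sure it is the intended fix), filtering the stale `inv_freq` entries out of the archive before loading avoids the assertion, since recent transformers releases register `rotary_emb.inv_freq` as a non-persistent buffer and no longer expect it in the state dict:

```python
# hypothetical patch in the same context as the loading code above:
# drop rotary inv_freq buffers that were saved by an older transformers version
d = torch.load(params.archive, map_location='cpu')
d["model"] = {k: v for k, v in d["model"].items()
              if not k.endswith("rotary_emb.inv_freq")}
self.alg.load_state_dict(d["model"], False)
```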

Do you have any ideas to solve this? Thank you.