zjunlp / EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
https://zjunlp.github.io/project/KnowEdit
MIT License

GPT2-XL and GPT-J evaluating with ZsRE #13

Closed. msumitaml closed this issue 1 year ago.

msumitaml commented 1 year ago

Hello,

Thank you for this detailed and excellent work.

Could you please tell me whether we can evaluate editing with the ZsRE dataset on GPT2-XL, GPT-J, or any model other than the Llama-2 example given in the examples folder? If so, how can we do that?

Thanks in advance!

pengzju commented 1 year ago

EasyEdit can be easily applied to any supported model. For example, if you want to edit GPT-J through ROME, you can run:

python run_zsre_llama2.py \
    --editing_method=ROME \
    --hparams_dir=../hparams/ROME/gpt-j-6B \
    --data_dir=./data
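
For reference, the same edit can also be applied programmatically. Here is a minimal sketch assuming the BaseEditor API shown in the README and the hparams file above; the prompt, targets, and subject are illustrative placeholders:

from easyeditor import BaseEditor, ROMEHyperParams

# Load the GPT-J ROME hyperparameters (the same file as in the command above)
hparams = ROMEHyperParams.from_hparams('../hparams/ROME/gpt-j-6B')
editor = BaseEditor.from_hparams(hparams)

# Apply one edit; ROME also needs the subject mentioned in each prompt
metrics, edited_model, _ = editor.edit(
    prompts=['Ray Charles, the'],
    ground_truth=['piano'],
    target_new=['violin'],
    subject=['Ray Charles'],
)
print(metrics)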

Thank you for your reminder.

If it helps you solve the problem, please close this issue.

msumitaml commented 1 year ago

Hello,

Thank you for your response. I have tried FT, IKE, and ROME, and they work for GPT2-XL and GPT-J-6B, but MEMIT does not. It fails with "ValueError: BuilderConfig '20200501.en' not found. Available: ['20220301.aa....".

I have downloaded model.layers.4.mlp.down_proj_float32_mom2_100000.npz etc. and put them in the folder ./data/stats/._hugging_cache_gpt2-xl/wikipedia_stats/. What else should I do so that I can run MEMIT as well? Please let me know.

Thank you!

pengzju commented 1 year ago

As noted in the README, MEMIT cannot bypass the computation of the second-order momentum statistics, so it requires the Wikipedia-derived .npz files. However, we have only released those precomputed statistics for Llama-2 at the moment.
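
For context, the ValueError appears because, when the required .npz file is missing, the code falls back to computing the statistics from Wikipedia, and the '20200501.en' snapshot it requests is no longer hosted. A minimal sketch of the failing call and a workaround, assuming the dataset is fetched through the Hugging Face datasets library (the exact call site in EasyEdit may differ):

from datasets import load_dataset

# Fails: the '20200501.en' Wikipedia dump is no longer available on the Hub
# raw_ds = load_dataset('wikipedia', '20200501.en')

# Works: pick a dump from the "Available" list in the error message
raw_ds = load_dataset('wikipedia', '20220301.en')

Note that recomputing the statistics this way is slow; downloading precomputed files (see below) is preferable.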

pengzju commented 1 year ago

If this solves your problem, please close the issue; if you want to ask a new question, please open a new one.

qizhou000 commented 1 year ago

> I have tried FT, IKE, and ROME ... but MEMIT does not work. ... What else should I do so that I can run MEMIT as well?

You can download the second-order momentum statistics for gpt2-xl and gpt-j-6B used by ROME and MEMIT from the ROME project site hosted at MIT: https://rome.baulab.info/data/stats/
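
For example, a minimal download sketch; the directory layout and the layer file name below are assumptions based on the ROME project's conventions, so adjust them to match your hparams and stats directory:

import os
import urllib.request

BASE = 'https://rome.baulab.info/data/stats'
MODEL = 'gpt2-xl'
# Assumed file name pattern: <layer module>_float32_mom2_100000.npz
FNAME = 'transformer.h.17.mlp.c_proj_float32_mom2_100000.npz'

# Assumed local layout mirroring the remote one
stats_dir = os.path.join('./data/stats', MODEL, 'wikipedia_stats')
os.makedirs(stats_dir, exist_ok=True)
urllib.request.urlretrieve(
    f'{BASE}/{MODEL}/wikipedia_stats/{FNAME}',
    os.path.join(stats_dir, FNAME),
)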

pengzju commented 1 year ago

> You can download the second-order momentum of gpt2-xl and gpt-j-6B used in ROME and MEMIT at https://rome.baulab.info/data/stats/

Thank you for your reply. Downloading these statistics directly is the best solution; I have updated my original reply.