zjunlp / EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
https://zjunlp.github.io/project/KnowEdit
MIT License
1.74k stars · 210 forks

Is there any way to reduce GPU memory usage when running ROME (wikibio and convsent) and MEND (all datasets) on an A40 (48G)? #246

Closed · Blank141 closed this 4 months ago

Blank141 commented 4 months ago

When I try to reproduce the reported results from the paper, I encounter an OOM error. My device is 2× A40 (48 GB each).

The following are my commands:

```
python run_knowedit_llama2.py --editing_method=ROME --hparams_dir=../hparams/ROME/llama-7b --data_dir=./data/KnowEdit/benchmark/wikibio/test-all.ar.json --datatype='wikibio' --ds_size=10

python run_convsent_llama2.py --hparams_dir ../hparams/ROME/llama-7b.yaml --editing_method ROME --data_dir ./data/KnowEdit/benchmark/convsent --ds_size=10
```

Even with ds_size=10, ROME runs out of memory (on wikibio and convsent).

MEND uses 46 GB and can only run on a single GPU, but each of my A40s has 48 GB of memory.

XeeKee commented 4 months ago

Hello, EasyEdit's ROME method supports model parallelism. You can set model_parallel: true in the hyperparameters, which will reduce the memory usage on a single card. The MEND method does not support model parallelism, so unfortunately I don't have a good way to help with that. I'm sorry.
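For reference, a minimal sketch of the change, assuming the hparams file referenced above (`../hparams/ROME/llama-7b.yaml`) follows EasyEdit's usual flat YAML layout; only the `model_parallel` key is confirmed by this thread, and the comment is illustrative:

```yaml
# ../hparams/ROME/llama-7b.yaml (excerpt — leave the other fields as they are)
model_parallel: true   # shard the model across the visible GPUs instead of loading it on one card
```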

Blank141 commented 4 months ago

Thank you! I have set 'model_parallel: true' and 'device: 0,1' in the ROME experiments, but the OOM problem still exists, and only for wikibio and convsent — the others work very well.

XeeKee commented 4 months ago

Because the texts in wikibio and convsent are longer, memory usage is larger. I may not have particularly good suggestions to help you solve this.
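To see why longer inputs blow up memory, here is a rough back-of-the-envelope sketch (not EasyEdit code): naive attention materializes an L×L score matrix per head per layer, so activation memory grows quadratically with sequence length. The defaults below are LLaMA-7B-like (32 layers, 32 heads, hidden size 4096); fp32 bytes and the exact bookkeeping are simplifying assumptions.

```python
def attention_activation_bytes(seq_len, n_layers=32, n_heads=32,
                               hidden=4096, bytes_per=4):
    """Rough estimate of activation memory for one forward pass.

    Naive attention keeps an (n_heads, L, L) score matrix per layer,
    plus the (L, hidden) hidden-state activations per layer. This
    ignores KV caches, MLP intermediates, and gradients, so it is a
    lower bound meant only to show the scaling behavior.
    """
    scores = n_layers * n_heads * seq_len * seq_len * bytes_per  # O(L^2)
    hidden_acts = n_layers * seq_len * hidden * bytes_per        # O(L)
    return scores + hidden_acts


if __name__ == "__main__":
    for L in (128, 512, 2048):
        gb = attention_activation_bytes(L) / 1024**3
        print(f"seq_len={L:5d}: ~{gb:.2f} GiB of attention activations")
```

The quadratic score term dominates at long lengths, which is consistent with wikibio/convsent (longer texts) failing while the shorter-text datasets fit.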

Blank141 commented 4 months ago

Okay, thank you very much!