yaojin17 / Unlearning_LLM

[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
MIT License

How much video memory is needed for this experiment? #9

Open carbonatedbeverages opened 3 weeks ago

carbonatedbeverages commented 3 weeks ago

I set the parameter gradient_accumulation_steps to 1 and batch_size to 1, and used LoRA to reduce the number of trainable parameters to 3,276,800. However, with two V100s (32GB each), I still can't run this experiment due to CUDA out of memory. What other methods can reduce the GPU memory requirement?

yaojin17 commented 2 weeks ago

Hi, I used 8 A100 GPUs with 80GB of memory each to fine-tune the model. For your case, I suggest using FP16 training and further reducing the number of LoRA trainable parameters to conduct the experiments.
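
To see why a smaller LoRA rank helps, here is a back-of-envelope sketch (not from this repo; the layer dimensions and Adam assumptions are illustrative) of how rank and precision drive the trainable-parameter memory. Note that for a 7B-scale model, the frozen base weights and activations usually dominate, which is why FP16 on the base model matters most.

```python
# Hypothetical memory estimate for LoRA fine-tuning.
# For one adapted weight of shape (d_out, d_in), LoRA trains
# A (rank x d_in) and B (d_out x rank), so params = rank * (d_in + d_out).

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one LoRA adapter pair."""
    return rank * (d_in + d_out)

def trainable_memory_bytes(n_params: int, bytes_per_param: int = 4,
                           adam_states: int = 2) -> int:
    """Rough memory for trainable weights + gradients + Adam moments.
    FP32 Adam keeps ~4 copies per parameter (weight, grad, two moments)."""
    return n_params * bytes_per_param * (1 + 1 + adam_states)

# Halving the rank halves the trainable parameters (and their
# gradient/optimizer memory) for every adapted layer:
full = lora_params(4096, 4096, rank=16)  # 16 * 8192 = 131072
half = lora_params(4096, 4096, rank=8)   #  8 * 8192 =  65536

# FP16 weights/grads (bytes_per_param=2) further cut this memory in half.
print(full, half, trainable_memory_bytes(full))
```

In practice this means lowering the LoRA `r` (and restricting which modules get adapters) shrinks optimizer-state memory linearly, while FP16 halves the footprint of whatever remains.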