AGI-Edgerunners / LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
Apache License 2.0
1.01k stars 91 forks

[Bug] LoRA fine-tuning memory keeps rising until Out Of Memory #33

Open angelOnly opened 11 months ago

angelOnly commented 11 months ago

[Bug] LoRA fine-tuning memory keeps rising until Out Of Memory

Other projects provide an overwrite_cache parameter that releases the rapidly growing GPU memory. In this project, adding that parameter has no effect. Is there a way to solve this?


HZQ950419 commented 11 months ago


I can't reproduce the bug. I tried with

CUDA_VISIBLE_DEVICES=0 python --base_model 'yahma/llama-7b-hf' --data_path 'math_10k.json' --output_dir './trained_models/test/' --batch_size 128 --micro_batch_size 4 --num_epochs 3 --learning_rate 3e-4 --cutoff_len 256 --val_set_size 120 --eval_step 80 --save_step 80 --adapter_name lora --load_8bit --target_modules '["up_proj", "down_proj"]'

The memory used is 18469 MB, and it does not keep increasing. I used a single 3090 for testing. Could you provide more information, such as the command, PyTorch version, GPU, and so on?
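To make a report like this easier to compare, here is a minimal sketch of how one could tell a stable plateau from steady growth given a series of memory readings. The function name `memory_keeps_rising`, the `warmup` count, and the 64 MB tolerance are my own illustrative choices, not part of this repo. The readings could come from, for example, `torch.cuda.memory_allocated() / 2**20` logged every N steps, which may differ from the number nvidia-smi shows.

```python
from typing import Sequence


def memory_keeps_rising(
    samples_mb: Sequence[float],
    warmup: int = 2,
    tolerance_mb: float = 64.0,
) -> bool:
    """Hypothetical helper: True if memory grows clearly after warmup.

    Drops the first `warmup` samples (model load, first batches), splits the
    rest in half, and checks whether the lowest late reading is above the
    highest early reading by more than `tolerance_mb`.
    """
    steady = list(samples_mb)[warmup:]
    if len(steady) < 4:
        raise ValueError("need at least 4 samples after warmup")
    half = len(steady) // 2
    early_peak = max(steady[:half])
    late_floor = min(steady[half:])
    return late_floor - early_peak > tolerance_mb


# Made-up readings in MB, one per logging interval.
plateau = [12000, 18000, 18469, 18469, 18470, 18469, 18469, 18468]
growing = [12000, 15000, 16000, 17000, 18000, 19000, 20000, 21000]

print(memory_keeps_rising(plateau))
print(memory_keeps_rising(growing))
```

Posting the raw readings along with the command and versions would help narrow down whether the growth happens during training steps or at evaluation/save steps.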

BTW, the argument --overwrite_cache controls loading of the cached dataset in ChatGLM2-6B (ptuning), lines 234, 253, and 272. It is not related to GPU memory.
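To illustrate what a flag like this typically does, here is a small stand-alone sketch. It is not the ChatGLM2-6B code; `load_or_tokenize` and the JSON cache file are hypothetical stand-ins. As far as I know, scripts built on Hugging Face `datasets` usually forward the flag as `load_from_cache_file=not overwrite_cache` to `Dataset.map`, which has the same effect: it only decides whether preprocessed data is read from disk or recomputed.

```python
import json
import os
import tempfile


def load_or_tokenize(texts, cache_path, overwrite_cache=False):
    """Hypothetical stand-in for cached dataset preprocessing.

    Returns (processed_data, source) where source is "cache" or "recomputed".
    """
    # Reuse the on-disk cache unless the caller asks to overwrite it.
    if os.path.exists(cache_path) and not overwrite_cache:
        with open(cache_path) as f:
            return json.load(f), "cache"

    # Toy "tokenization": lowercase and split on whitespace.
    processed = [t.lower().split() for t in texts]
    with open(cache_path, "w") as f:
        json.dump(processed, f)
    return processed, "recomputed"


cache_file = os.path.join(tempfile.mkdtemp(), "train_cache.json")
texts = ["What is 2 plus 3"]

print(load_or_tokenize(texts, cache_file)[1])                        # first run
print(load_or_tokenize(texts, cache_file)[1])                        # second run
print(load_or_tokenize(texts, cache_file, overwrite_cache=True)[1])  # forced
```

Nothing in this path touches the GPU, so such a flag would not be expected to change memory usage during fine-tuning.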
