beyondguo / LLM-Tuning

Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning.

Why does GPU memory keep growing during training? It hits OOM after a while #45

Closed Amazing-J closed 1 year ago

Amazing-J commented 1 year ago

```
CUDA_VISIBLE_DEVICES=4 python baichuan_lora_tuning.py \
    --tokenized_dataset hc3_chatgpt_zh_specific_qa_baichuan-7B \
    --lora_rank 4 \
    --per_device_train_batch_size 64 \
    --gradient_accumulation_steps 2 \
    --num_train_epochs 2 \
    --save_steps 200 \
    --save_total_limit 2 \
    --learning_rate 1e-4 \
    --fp16 \
    --remove_unused_columns false \
    --logging_steps 20 \
    --output_dir weights/hc3_chatgpt_zh_specific_qa_baichuan-7B
```

beyondguo commented 1 year ago

That batch size is a bit extravagant. GPU memory usage is also dynamic: if the texts early in training happen to be short, memory may hold up, but once a very long sample comes along later, it may no longer fit.
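
For reference, a common adjustment is to shrink `--per_device_train_batch_size` and raise `--gradient_accumulation_steps` so the effective batch size stays the same (64 × 2 = 128) while the per-step peak memory drops. This is only a sketch; everything except those two flags is copied unchanged from the command above:

```
# Effective batch size stays 8 * 16 = 128, but each forward/backward pass
# holds far fewer sequences in memory, so a single long sample is less
# likely to trigger OOM.
CUDA_VISIBLE_DEVICES=4 python baichuan_lora_tuning.py \
    --tokenized_dataset hc3_chatgpt_zh_specific_qa_baichuan-7B \
    --lora_rank 4 \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 16 \
    --num_train_epochs 2 \
    --save_steps 200 \
    --save_total_limit 2 \
    --learning_rate 1e-4 \
    --fp16 \
    --remove_unused_columns false \
    --logging_steps 20 \
    --output_dir weights/hc3_chatgpt_zh_specific_qa_baichuan-7B
```

Capping the maximum sequence length when the dataset is tokenized (if your preprocessing script supports it) also bounds the worst-case memory spike from unusually long samples.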