LoRA fine-tuning of baichuan2-7b-base on 3*V100 (16G) still hits OOM? How much memory is actually needed?

baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology

https://huggingface.co/baichuan-inc

Apache License 2.0

4.08k stars, 293 forks

LoRA fine-tuning of baichuan2-7b-base on 3*V100 (16G) still hits OOM? How much memory is actually needed? #162

Open lvjianxin opened 1 year ago

lvjianxin commented 1 year ago

As the title says. I am using the default data, with this command:

    deepspeed --include=localhost:4,5,7 fine-tune.py \
        --report_to "none" \
        --data_path "data/belle_chat_ramdon_10k.json" \
        --model_name_or_path "/home/admin/baichuan2/baichuan-inc/Baichuan2-7B-Base" \
        --output_dir "output" \
        --model_max_length 64 \
        --num_train_epochs 1 \
        --per_device_train_batch_size 1 \
        --gradient_accumulation_steps 1 \
        --save_strategy epoch \
        --learning_rate 2e-5 \
        --lr_scheduler_type constant \
        --adam_beta1 0.9 \
        --adam_beta2 0.98 \
        --adam_epsilon 1e-8 \
        --max_grad_norm 1.0 \
        --weight_decay 1e-4 \
        --warmup_ratio 0.0 \
        --logging_steps 1 \
        --gradient_checkpointing True \
        --deepspeed ds_config.json \
        --bf16 False \
        --tf32 False \
        --use_lora True
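
As a rough starting point for the memory question, the sketch below estimates the memory taken by the model weights alone. It assumes about 7e9 parameters (read off the "7B" in the model name, not an exact count) and ignores activations, LoRA adapter weights, optimizer state, and framework overhead, so real usage will be higher. The helper names `weight_memory_gib` and `fits_on_gpu` are hypothetical. LoRA freezes the base weights, but they still have to be resident on each GPU unless something shards them (for example DeepSpeed ZeRO stage 3). Whether this run actually loads the weights in fp32 depends on fine-tune.py and ds_config.json, which are not shown here.

```python
# Back-of-the-envelope estimate of weight memory only.
# Assumption: ~7e9 parameters, inferred from the "7B" in the model name.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}
GIB = 1024 ** 3


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Memory needed to hold the weights in the given dtype, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / GIB


def fits_on_gpu(num_params: float, dtype: str, gpu_gib: float = 16.0) -> bool:
    """True if the weights alone are smaller than the GPU memory."""
    return weight_memory_gib(num_params, dtype) < gpu_gib


if __name__ == "__main__":
    num_params = 7e9  # assumed parameter count
    for dtype in BYTES_PER_PARAM:
        size = weight_memory_gib(num_params, dtype)
        verdict = "fits" if fits_on_gpu(num_params, dtype) else "does not fit"
        print(f"{dtype}: {size:.1f} GiB of weights, {verdict} in 16 GiB")
```

Under these assumptions, an unsharded fp32 copy of the weights is already larger than one 16G V100, and an fp16 copy leaves only a few GiB for everything else, which would be consistent with an OOM even at `--model_max_length 64` and batch size 1.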