hiyouga / LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs
Apache License 2.0

Is full-parameter fine-tuning with FSDP + fp16 supported? #4437

Closed — zhangfan-algo closed this 5 days ago

zhangfan-algo commented 5 days ago

Reminder

System Info

Also, which is faster: FSDP + fp16 or ZeRO-3 offload + fp16?

Reproduction

```yaml
### model
model_name_or_path: /mnt/cluster//models/Qwen/Qwen1.5-1.8B-Chat

### method
stage: sft
do_train: true
do_eval: true
finetuning_type: full
deepspeed: /mnt/cluster/LLaMA-Factory_0614/examples/deepspeed/ds_z3_config.json

### dataset
dataset: test
template: qwen
cutoff_len: 19500
overwrite_cache: true
preprocessing_num_workers: 60

### output
output_dir: test
logging_steps: 10
logging_first_step: true
save_total_limit: 5
save_strategy: epoch
plot_loss: true

### train
gradient_checkpointing: true
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 5.0
lr_scheduler_type: linear
warmup_ratio: 0.03
bf16: true
ddp_timeout: 180000000
neftune_noise_alpha: 5

### eval
val_size: 0.01
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 50
```

Expected behavior

No response

Others

No response

hiyouga commented 5 days ago

FSDP is supported — see the FSDP QLoRA example.
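For reference, FSDP in LLaMA-Factory is driven through a Hugging Face Accelerate config rather than a `deepspeed:` entry. The sketch below is a minimal, unverified example of such an Accelerate FSDP config with fp16 mixed precision; the exact field values (wrap policy, prefetch, process count) are assumptions that should be checked against the repository's `examples/accelerate` configs for your version. When using it, the `deepspeed:` key must be removed from the training YAML, since DeepSpeed and FSDP cannot be combined.

```yaml
# Hypothetical Accelerate FSDP config (e.g. fsdp_config.yaml) — adapt to your setup.
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
mixed_precision: fp16            # fp16 as asked in the issue; bf16 also works on Ampere+
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch: BACKWARD_PRE
  fsdp_forward_prefetch: false
  fsdp_offload_params: false     # set true to trade speed for memory, akin to ZeRO-3 offload
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sync_module_states: true
  fsdp_use_orig_params: true
machine_rank: 0
num_machines: 1
num_processes: 8                 # number of GPUs; adjust to your cluster
rdzv_backend: static
same_network: true
use_cpu: false
```

Training would then be launched with something like `accelerate launch --config_file fsdp_config.yaml src/train.py your_train_config.yaml` (the entry-point path varies by LLaMA-Factory version). On the speed question: with parameters kept on GPU (`fsdp_offload_params: false`), FSDP is generally faster than ZeRO-3 with CPU offload, since offloading pays PCIe transfer costs on every step; with offload enabled, the two behave similarly.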