badmic closed this issue 4 months ago
The training script is:

deepspeed --num_gpus 4 src/train_bash.py \
    --deepspeed ds_config.json \
    --stage dpo \
    --do_train True \
    --model_name_or_path /home/LLaMA-Factory-main/saves/Qwen1.5-14B-Chat/freeze/train_2024-04-25001/ \
    --finetuning_type lora \
    --template qwen \
    --dataset_dir data \
    --dataset compare_data_wenda \
    --cutoff_len 2048 \
    --learning_rate 0.0001 \
    --num_train_epochs 20.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --lora_rank 8 \
    --lora_dropout 0.1 \
    --lora_target c_attn \
    --create_new_adapter True \
    --output_dir saves/Qwen-14B-Chat/lora/dpo_lora_wenda_001 \
    --fp16 True \
    --dpo_beta 0.1 \
    --dpo_ftx 0 \
    --plot_loss True

It fails with this error:

ValueError: Target modules {'c_attn'} not found in the base model. Please check the target modules and try again.
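The template is wrong.
@hiyouga Thanks. Isn't qwen the right template for Qwen1.5-14B-Chat?
Correction: I meant lora_target, not template.
Then what should I set it to for DPO training?
OK, removing that parameter fixed it. Thanks!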
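For anyone hitting the same error: a target passes only if some module name in the base model equals it or ends with "." plus the target. The sketch below shows that check in plain Python. The function name `missing_lora_targets` is hypothetical. The suffix rule is my understanding of how PEFT matches a list of target modules. The module names are illustrative, written in the style of the Hugging Face Qwen2 layout that Qwen1.5 checkpoints use.

```python
def missing_lora_targets(module_names, targets):
    """Return the targets that match no module name.

    Assumption: a target counts as found when a module name equals it
    or ends with "." + target (suffix matching on the dotted path).
    """
    names = list(module_names)
    return sorted(
        t for t in targets
        if not any(n == t or n.endswith("." + t) for n in names)
    )


# Illustrative names only (Qwen2-style layout), not dumped from a real checkpoint.
qwen15_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
    "model.layers.0.mlp.down_proj",
]

print(missing_lora_targets(qwen15_names, {"c_attn"}))            # ['c_attn']
print(missing_lora_targets(qwen15_names, {"q_proj", "v_proj"}))  # []
```

With a loaded model, `module_names` can be built as `[name for name, _ in model.named_modules()]`, which lets you check a `--lora_target` value before starting a long DeepSpeed run.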
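You are probably using Qwen2, whose structure differs from Qwen1, so lora_target needs to change.
Here is the one for Qwen2:
Here is the one for Qwen1: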
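To see the difference between the two layouts yourself, one option is to list the leaf module names of the model and count their last path components. This is a minimal sketch. `candidate_target_suffixes` is a hypothetical helper, and the sample names are illustrative ones in the style of the original Qwen (Qwen1) remote code, not a dump of a real checkpoint. Working from names alone cannot tell linear layers from norms or activations, so with a real model you would also filter on the module type.

```python
from collections import Counter


def candidate_target_suffixes(module_names):
    """Count the last path component of every leaf module name.

    A leaf is a name that is not an ancestor prefix of any other name.
    """
    names = set(module_names)
    ancestors = set()
    for name in names:
        parts = name.split(".")
        # Record every proper dotted prefix, e.g. "a", "a.b" for "a.b.c".
        for i in range(1, len(parts)):
            ancestors.add(".".join(parts[:i]))
    leaves = names - ancestors
    return Counter(name.rsplit(".", 1)[-1] for name in leaves)


# Illustrative Qwen1-style names (assumption, for demonstration only).
qwen1_names = [
    "transformer.h.0",
    "transformer.h.0.attn",
    "transformer.h.0.attn.c_attn",
    "transformer.h.0.attn.c_proj",
    "transformer.h.0.mlp",
    "transformer.h.0.mlp.w1",
    "transformer.h.0.mlp.w2",
    "transformer.h.0.mlp.c_proj",
]

for suffix, count in sorted(candidate_target_suffixes(qwen1_names).items()):
    print(suffix, count)
```

In this sample `c_attn` shows up as a candidate, while names such as `q_proj` do not, which is consistent with `c_attn` being valid for a Qwen1-style model but not for a Qwen2-style one.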