DPO training on a freeze-fine-tuned Qwen-14B-chat fails with: Target modules {'c_attn'} not found in the base model

badmic commented 4 months ago

Reminder

[X] I have read the README and searched the existing issues.

Reproduction

The training script is:

deepspeed --num_gpus 4 src/train_bash.py \
    --deepspeed ds_config.json \
    --stage dpo \
    --do_train True \
    --model_name_or_path /home/LLaMA-Factory-main/saves/Qwen1.5-14B-Chat/freeze/train_2024-04-25001/ \
    --finetuning_type lora \
    --template qwen \
    --dataset_dir data \
    --dataset compare_data_wenda \
    --cutoff_len 2048 \
    --learning_rate 0.0001 \
    --num_train_epochs 20.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --lora_rank 8 \
    --lora_dropout 0.1 \
    --lora_target c_attn \
    --create_new_adapter True \
    --output_dir saves/Qwen-14B-Chat/lora/dpo_lora_wenda_001 \
    --fp16 True \
    --dpo_beta 0.1 \
    --dpo_ftx 0 \
    --plot_loss True

It fails with the following error:

ValueError: Target modules {'c_attn'} not found in the base model. Please check the target modules and try again.

Expected behavior

No response

System Info

No response

Others

No response

hiyouga commented 4 months ago

The template is wrong.

badmic commented 4 months ago

@hiyouga Thanks. Isn't qwen the correct template for Qwen1.5-14B-chat?

hiyouga commented 4 months ago

My mistake, I meant lora_target.

badmic commented 4 months ago

Then what should lora_target be set to for DPO training?

badmic commented 4 months ago

OK, removing that parameter fixed it. Thanks!
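
For readers wondering why `c_attn` was rejected: the error is raised when none of the requested target names matches any module in the loaded model. Below is a minimal sketch of that kind of check, assuming exact-or-dotted-suffix matching on module names. The module names in `QWEN1_5_MODULES` are typed from memory for illustration (they were not dumped from the checkpoint in this issue), and `find_targets` / `check_targets` are hypothetical helpers, not LLaMA-Factory or PEFT functions.

```python
# Illustrative module names for one decoder layer of a Qwen1.5-style model.
# Assumption: typed from memory, not dumped from the checkpoint in this issue.
QWEN1_5_MODULES = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
    "model.layers.0.mlp.down_proj",
]


def find_targets(module_names, targets):
    """Map each target to the module names it matches (exact or dotted suffix)."""
    return {
        target: [
            name
            for name in module_names
            if name == target or name.endswith("." + target)
        ]
        for target in targets
    }


def check_targets(module_names, targets):
    """Raise if no requested target matches any module; otherwise return the matches."""
    matches = find_targets(module_names, targets)
    if not any(matches.values()):
        raise ValueError(
            f"Target modules {set(targets)} not found in the base model."
        )
    return matches
```

With these names, `check_targets(QWEN1_5_MODULES, ["c_attn"])` raises the `ValueError`, while `check_targets(QWEN1_5_MODULES, ["q_proj", "v_proj"])` returns one match per target.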

threeFeetCat123 commented 3 months ago

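You are probably using qwen2, whose structure differs from qwen1, so lora_target needs to be changed.

Here is the qwen2 structure:

[image not preserved]

And here is the qwen1 structure:

[image not preserved]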
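
One way to tell which of the two structures a checkpoint uses is to read `model_type` from its `config.json`: Qwen1.5 checkpoints report `qwen2`, while the original Qwen checkpoints report `qwen`. The sketch below does that and looks up attention projection names per type. `ATTENTION_TARGETS` and `suggest_lora_targets` are hypothetical names introduced here, and the table is written from memory of the two architectures, so treat it as an assumption and verify it against the loaded model.

```python
import json
from pathlib import Path

# Hypothetical lookup table (assumption, written from memory):
# attention projection module names per Hugging Face `model_type`.
ATTENTION_TARGETS = {
    "qwen": ["c_attn", "c_proj"],                       # original Qwen: fused QKV projection
    "qwen2": ["q_proj", "k_proj", "v_proj", "o_proj"],  # Qwen1.5 / Qwen2: separate projections
}


def suggest_lora_targets(model_dir):
    """Return (model_type, candidate LoRA target names) for a checkpoint directory."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    model_type = config.get("model_type")
    # Unknown model types get an empty list rather than a guess.
    return model_type, ATTENTION_TARGETS.get(model_type, [])
```

Calling `suggest_lora_targets` on the directory passed as `--model_name_or_path` would show whether `c_attn` is a plausible value for `--lora_target` before launching a training run.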
