hiyouga / LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0

Error when merging fine-tuned weights into the Qwen1.5-1.8B base model #2896

Closed GravitySaika closed 5 months ago

GravitySaika commented 5 months ago

Reproduction

These are the arguments I used for fine-tuning:

CUDA_VISIBLE_DEVICES=1 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path ./Qwen1.5-1.8B \
    --dataset finetune_set \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir ./Qwen1.5-1.8B/finetune \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 1.0 \
    --plot_loss \
    --fp16

These are the arguments I used for merging:

CUDA_VISIBLE_DEVICES=5 python src/export_model.py \
    --model_name_or_path ./Qwen1.5-1.8B \
    --adapter_name_or_path ./Qwen1.5-1.8B/finetune \
    --template default \
    --finetuning_type lora \
    --export_dir ./Qwen1.5-1.8B/finetune/merge \
    --export_size 10 \
    --export_legacy_format False

The merge failed with:

03/19/2024 23:11:40 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA                                                                                           
Traceback (most recent call last):                                                                                                                                       
  File "/mnt/sm870/yangqimin/LLaMA-Factory/src/export_model.py", line 9, in <module>                                                                                     
    main()                                                                                                                                                               
  File "/mnt/sm870/yangqimin/LLaMA-Factory/src/export_model.py", line 5, in main                                                                                         
    export_model()                                                                                                                                                       
  File "/mnt/sm870/yangqimin/LLaMA-Factory/src/llmtuner/train/tuner.py", line 52, in export_model                                                                        
    model, tokenizer = load_model_and_tokenizer(model_args, finetuning_args)                                                                                             
  File "/mnt/sm870/yangqimin/LLaMA-Factory/src/llmtuner/model/loader.py", line 150, in load_model_and_tokenizer                                                          
    model = load_model(tokenizer, model_args, finetuning_args, is_trainable, add_valuehead)                                                                              
  File "/mnt/sm870/yangqimin/LLaMA-Factory/src/llmtuner/model/loader.py", line 94, in load_model                                                                         
    model = init_adapter(model, model_args, finetuning_args, is_trainable)                                                                                               
  File "/mnt/sm870/yangqimin/LLaMA-Factory/src/llmtuner/model/adapter.py", line 110, in init_adapter                                                                     
    model: "LoraModel" = PeftModel.from_pretrained(model, adapter)                                                                                                       
  File "/home/sun/anaconda3/envs/jc-train/lib/python3.10/site-packages/peft/peft_model.py", line 324, in from_pretrained                                                 
    config = PEFT_TYPE_TO_CONFIG_MAPPING[                                                                                                                                
  File "/home/sun/anaconda3/envs/jc-train/lib/python3.10/site-packages/peft/config.py", line 151, in from_pretrained                                                     
    return cls.from_peft_type(**kwargs)                                                                                                                                  
  File "/home/sun/anaconda3/envs/jc-train/lib/python3.10/site-packages/peft/config.py", line 118, in from_peft_type                                                      
    return config_cls(**kwargs)                                                                                                                                          
TypeError: LoraConfig.__init__() got an unexpected keyword argument 'layer_replication' 
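
The error likely means that the adapter was saved by a newer peft release, which writes a layer_replication field into adapter_config.json, while the environment running export_model.py has an older peft whose LoraConfig does not accept that argument. A minimal illustration of the mismatch, assuming an older peft (the layer range below is hypothetical):

# On a peft version that predates layer_replication, this raises the same
# TypeError as in the traceback above; on a newer peft it succeeds.
from peft import LoraConfig

config = LoraConfig(layer_replication=[[0, 24]])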

By removing the 'layer_replication' entry from the adapter_config.json file in the fine-tuning output directory, I was able to merge the weights successfully and use the result. I'm not sure whether this is a bug, or whether this workaround will cause other problems.
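
A minimal sketch of that workaround in Python, assuming the adapter config sits in the output_dir used above (back up the file first):

import json

# Path to the adapter config written during fine-tuning (adjust as needed).
config_path = "./Qwen1.5-1.8B/finetune/adapter_config.json"

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

# Drop the key that the older peft's LoraConfig does not recognize.
config.pop("layer_replication", None)

with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)

The cleaner fix is to use the same peft version (or upgrade peft) in the export environment, so that the saved config and the loading code agree.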

Expected behavior

I wanted to fine-tune the Qwen1.5 model on my own dataset, then merge the weights and use the result.


hiyouga commented 5 months ago

I suspect the two commands were run with different peft environments.
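
One way to verify this is to print the peft version in each environment before running the two commands:

# Run once in the training environment and once in the export environment;
# a mismatch here can explain the TypeError above.
import peft
print(peft.__version__)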

GravitySaika commented 5 months ago

> I suspect the two commands were run with different peft environments.

It was indeed a different peft environment. Thanks!

kkk935208447 commented 5 months ago

> I suspect the two commands were run with different peft environments.

Thanks!

PlanetesDDH commented 5 months ago

666 (nice!)

Yezhibin701227 commented 3 months ago

Regarding CUDA_VISIBLE_DEVICES=5 python src/export_model.py: I can't find the export_model.py script. Where is it?

aihaidong commented 1 week ago

I ran into the same problem, and this was indeed the cause. Thanks!