ymcui / Chinese-LLaMA-Alpaca

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki
Apache License 2.0

TypeError: __init__() got an unexpected keyword argument 'merge_weights' #876

Closed: IsraelAbebe closed this issue 8 months ago

IsraelAbebe commented 9 months ago

Check before submitting issues

Type of Issue

Model training and fine-tuning

Base Model

LLaMA-7B

Operating System

Linux

Describe your issue in detail

lr=1e-4
lora_rank=8
lora_alpha=32
lora_trainable="q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj"
modules_to_save="embed_tokens,lm_head"
lora_dropout=0.05

pretrained_model='daryl149/llama-2-7b-hf'
chinese_tokenizer_path='chinese_llama_lora_7b/'
dataset_dir='../../data/'
per_device_train_batch_size=1
per_device_eval_batch_size=1
gradient_accumulation_steps=8
output_dir=output_dir
peft_model=chinese_llama_lora_7b/
validation_file=../../data/alpaca_data_zh_51k.json

deepspeed_config_file=ds_zero2_no_offload.json

torchrun --nnodes 1 --nproc_per_node 1 run_clm_sft_with_peft.py \
    --deepspeed ${deepspeed_config_file} \
    --model_name_or_path ${pretrained_model} \
    --tokenizer_name_or_path ${chinese_tokenizer_path} \
    --dataset_dir ${dataset_dir} \
    --validation_split_percentage 0.001 \
    --per_device_train_batch_size ${per_device_train_batch_size} \
    --per_device_eval_batch_size ${per_device_eval_batch_size} \
    --do_train \
    --do_eval \
    --seed $RANDOM \
    --fp16 \
    --num_train_epochs 1 \
    --lr_scheduler_type cosine \
    --learning_rate ${lr} \
    --warmup_ratio 0.03 \
    --weight_decay 0 \
    --logging_strategy steps \
    --logging_steps 10 \
    --save_strategy steps \
    --save_total_limit 3 \
    --evaluation_strategy steps \
    --eval_steps 100 \
    --save_steps 200 \
    --gradient_accumulation_steps ${gradient_accumulation_steps} \
    --preprocessing_num_workers 8 \
    --max_seq_length 512 \
    --output_dir ${output_dir} \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --lora_rank ${lora_rank} \
    --lora_alpha ${lora_alpha} \
    --trainable ${lora_trainable} \
    --modules_to_save ${modules_to_save} \
    --lora_dropout ${lora_dropout} \
    --torch_dtype float16 \
    --validation_file ${validation_file} \
    --gradient_checkpointing \
    --ddp_find_unused_parameters False \
    --peft_path ${peft_model}

Execution logs or screenshots

model = PeftModel.from_pretrained(model, training_args.peft_path)
  File "/home/azime/.local/lib/python3.9/site-packages/peft/peft_model.py", line 323, in from_pretrained
    config = PEFT_TYPE_TO_CONFIG_MAPPING[
  File "/home/azime/.local/lib/python3.9/site-packages/peft/config.py", line 137, in from_pretrained
    config = config_cls(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'merge_weights'
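This traceback points at a peft version mismatch: the adapter_config.json inside the peft_path directory (chinese_llama_lora_7b/ in the script above) was written by an older peft release that still serialized a merge_weights field, and the installed peft's LoraConfig no longer accepts that argument. A minimal workaround sketch, assuming the adapter directory from the script above and that merge_weights is the only stale key:

import json
import os

# Assumed path: the peft_path used in the launch script above.
adapter_dir = "chinese_llama_lora_7b"
config_path = os.path.join(adapter_dir, "adapter_config.json")

with open(config_path) as f:
    config = json.load(f)

# Older peft releases wrote 'merge_weights' into the adapter config; current
# LoraConfig rejects it, so drop the key before PeftModel.from_pretrained.
config.pop("merge_weights", None)

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

Alternatively, installing the peft version pinned in this repo's requirements.txt (the one the published LoRA adapters were saved with) avoids editing the file at all.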

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.

github-actions[bot] commented 8 months ago

Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.