LukeForeverYoung / UReader

Apache License 2.0

How can a checkpoint saved from single-node multi-GPU training with DeepSpeed ZeRO-1 + LoRA be exported for inference? #7

Open xuetaolue opened 11 months ago

xuetaolue commented 11 months ago

LoRA is integrated into the language-model branch as shown below. After training finishes, how should the saved checkpoint be loaded for inference?

```python
peft_config = LoraConfig(
    target_modules=r'.*\.(q_proj|v_proj)',
    inference_mode=args.inference_mode,
    r=args.lora_r,
    lora_alpha=args.lora_alpha,
    lora_dropout=args.lora_dropout,
)
model.language_model = get_peft_model(model.language_model, peft_config)
model.language_model.print_trainable_parameters()
```
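For reference, a minimal sketch of one way the DeepSpeed ZeRO-1 checkpoint could be consolidated and loaded back into this LoRA-wrapped model for inference. The checkpoint path is a placeholder, and the sketch assumes the standard DeepSpeed checkpoint layout and the `zero_to_fp32` utility shipped with DeepSpeed; it is not necessarily how this repository handles it.

```python
# Hedged sketch: consolidate the ZeRO checkpoint into a single fp32 state dict,
# then load it into the model that already has LoRA attached to language_model,
# so the LoRA parameter names in the state dict match the module tree.
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

checkpoint_dir = 'output/checkpoint-1000'  # placeholder path
state_dict = get_fp32_state_dict_from_zero_checkpoint(checkpoint_dir)

missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f'missing keys: {len(missing)}, unexpected keys: {len(unexpected)}')
model.eval()
```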

Why not use the following, more conventional way of integrating LoRA?

```python
if args.language_training_method == 'lora':
    peft_config = LoraConfig(
        target_modules=r'.*language_model.*\.(q_proj|v_proj)',
        inference_mode=args.inference_mode,
        r=args.lora_r,
        lora_alpha=args.lora_alpha,
        lora_dropout=args.lora_dropout,
    )
    model = get_peft_model(model, peft_config)
    model.print_trainable_parameters()
```
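With this whole-model wrapping, inference could follow the usual PEFT adapter workflow. A minimal sketch, assuming the base UReader model has already been rebuilt as `base_model` (repo-specific code omitted) and using placeholder paths:

```python
from peft import PeftModel

# After training: save only the LoRA adapter weights.
model.save_pretrained('output/lora_adapter')  # placeholder path

# At inference time: attach the trained adapter to the rebuilt base model,
# then optionally merge the LoRA weights into the base weights.
peft_model = PeftModel.from_pretrained(base_model, 'output/lora_adapter')
merged_model = peft_model.merge_and_unload()
merged_model.eval()
```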

LukeForeverYoung commented 10 months ago

The solution you put forward looks more practical and easier to use. Would you mind creating a pull request so I can merge your commits into the master branch?

xuetaolue commented 10 months ago

> The solution you put forward looks more practical and easier to use. Would you mind creating a pull request so I can merge your commits into the master branch?

Of course, I will create a pull request. Here are some new findings: the second solution has a side effect in that eval_loss is no longer reported during training, but this can be fixed by adding `label_names=['labels']` to the training arguments, i.e. `TrainingArguments(..., label_names=['labels'])`. Detailed description here.
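A minimal sketch of the fix described above; the other arguments are placeholders:

```python
from transformers import TrainingArguments

# Declaring label_names explicitly keeps the Trainer computing eval_loss
# when the model is wrapped by PEFT.
training_args = TrainingArguments(
    output_dir='output',          # placeholder
    evaluation_strategy='steps',  # placeholder
    label_names=['labels'],
)
```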