Qwen1.5-7B-base模型微调后，模型的回答很奇怪，一直不停止

waltonfuture commented 3 months ago

Reminder

[X] I have read the README and searched the existing issues.

Reproduction

作者您好！ base模型的微调脚本：

#!/bin/bash

deepspeed --num_gpus 4 src/train_bash.py \
    --deepspeed ./deep_speed_zero2.json \
    --stage sft \
    --model_name_or_path /home/data/lilab06/code/other/ckpt/Qwen1.5-7B  \
    --do_train \
    --dataset diabete_drug,diabete_textbook,diabete_KG,record-diag,diabete_multi_agent,moss_medium \
    --template qwen \
    --finetuning_type lora \
    --lora_target all \
    --output_dir  ./ckpt/qwenbase_mix_moss \
    --overwrite_cache \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --learning_rate 1e-4 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --bf16 \
    --overwrite_output_dir \
    --do_eval \
    --per_device_eval_batch_size 1 \
    --val_size 0.1 \
    --eval_steps 50 \
    --evaluation_strategy steps \
    --load_best_model_at_end \

微调后合并lora脚本：

#!/bin/bash

python src/export_model.py \
    --model_name_or_path "/home/data/lilab06/code/other/ckpt/Qwen1.5-7B" \
    --adapter_name_or_path ./ckpt/qwenbase_mix_moss \
    --template qwen \
    --finetuning_type lora \
    --export_dir ./ckpt/merged/qwenbase_mix_moss \
    --export_size 10 \
    --export_legacy_format False

使用合并lora后的模型推理

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import pandas as pd
from datetime import datetime
today = datetime.today().strftime('%Y-%m-%d')
seed = 42
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

device = "cuda" # the device to load the model onto
model_path = "/home/data/lilab06/code/LLaMA-Factory/ckpt/merged/qwenbase_mix_moss"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    #torch_dtype="auto",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

def model_output(history):
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
    ]
    messages += history
    text = tokenizer.apply_chat_template(
        history,
        tokenize=False,
        add_generation_prompt=True
    )
    #print(text)
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=256,
        #do_sample=True
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    #response = tokenizer.batch_decode(generated_ids)[0]
    return response

prompt = "二甲双胍恩格列净片的推荐剂量是多少？"
# # 第一轮
history = [{"role": "user", "content": prompt}]
response = model_output(history)
print(response)

模型输出：

二甲双胍恩格列净片的推荐剂量为早晨100mg，晚上80mg。
system
You are a helpful assistant.
user
请根据以下文本生成适合每个空缺处（#idiom%d#）的正确成语，以列表格式给出

['“我从没想过要离开，”阿联说，“我在这里感觉很好，我爱这里，我爱这里的球迷，我爱这里的队友，我爱这里的教练，我爱这里的每一个工作人员，我爱这里的一切。我在这里
#idiom008000#，我在这里打球，我在这里生活，我在这里成长，我在这里学习，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我
  这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在
  里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，我在这里打球，

我使用Qwen1.5-7B-Chat微调，模型就不会出现这种情况。我的微调和推理都用了qwen模板。请问是什么原因导致base模型微调失败呢？谢谢解答！

Expected behavior

希望微调base模型后，模型可以正常回答，而不是持续输出无意义内容

System Info

No response

Others

No response

hiyouga commented 3 months ago

我们建议训练 base 模型使用 default template 如果你使用 qwen template，则需要在 lora 训练时同时加入 embedding 层，即 --additional_target embed_tokens,lm_head

waltonfuture commented 3 months ago

@hiyouga 请问使用default template训练base模型的话，推理的template也得用default而不是qwen是吗？

hiyouga commented 3 months ago

训练推理必须一致

codemaster17611 commented 3 months ago

我们建议训练 base 模型使用 default template 如果你使用 qwen template，则需要在 lora 训练时同时加入 embedding 层，即 --additional_target embed_tokens,lm_head

您好，我想请教下，我基于Qwen1.5 base做全量sft训练，您说模版建议用base有什么特别的知识吗？

hiyouga commented 3 months ago

@codemaster17611 全量训练的话用什么模板都行，记得加 --resize_vocab

GravitySaika commented 3 months ago

您好！请问您能正常合并base微调之后的adpter和base模型吗。我在合并模型的时候提示 TypeError: LoraConfig.init() got an unexpected keyword argument 'layer_replication' 我需要手动去微调得到的adpter_config.json文件中删去'layer_replication'项，不知道这是否正常？

waltonfuture commented 3 months ago

您好！请问您能正常合并base微调之后的adpter和base模型吗。我在合并模型的时候提示 TypeError: LoraConfig.init() got an unexpected keyword argument 'layer_replication' 我需要手动去微调得到的adpter_config.json文件中删去'layer_replication'项，不知道这是否正常？

我可以正常合并。可能要检查一下训练用的lora target？

GravitySaika commented 3 months ago

您好！请问您能正常合并base微调之后的adpter和base模型吗。我在合并模型的时候提示 TypeError: LoraConfig.init() got an unexpected keyword argument 'layer_replication' 我需要手动去微调得到的adpter_config.json文件中删去'layer_replication'项，不知道这是否正常？

我可以正常合并。可能要检查一下训练用的lora target？

不好意思，由于我的粗心，这个存在的问题是我合并和微调使用的是两台不同的服务器，安装peft时版本不一致导致的。感谢您抽空回答！

hiyouga / LLaMA-Factory