QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Apache License 2.0

[BUG] Multi-turn dialogue data format issue during fine-tuning #1157

Closed Jarvanen closed 6 months ago

Jarvanen commented 6 months ago

是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?

该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?

当前行为 | Current Behavior

Fine-tuning with LoRA on a single GPU; dialogues use the multi-turn format; about 2,000 dialogue samples; LR: 3e-4; epochs: 5 and 10.

期望行为 | Expected Behavior

No response

复现方法 | Steps To Reproduce

No response

运行环境 | Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

备注 | Anything else?

The multi-turn dialogues used do not follow a strict one-question-one-answer user/assistant alternation; there can be multiple user messages for one assistant message (or the reverse). Does data in this format affect the fine-tuning result? The fine-tuned model can answer questions normally after merging, but its instruction following is poor: sometimes it does not execute the prompt's instruction and instead directly outputs some of the fine-tuning data.
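For reference, a minimal sketch of checking whether a conversation strictly alternates user/assistant turns, assuming the `conversations`/`from`/`value` record layout shown in the repo's fine-tuning examples (the sample record itself is invented for illustration):

```python
# Hypothetical training record in the conversations format used by finetune.py;
# the field names follow the repo's examples, but this record is invented.
sample = {
    "id": "identity_0",
    "conversations": [
        {"from": "user", "value": "你好"},
        {"from": "user", "value": "在吗?"},          # second consecutive user turn
        {"from": "assistant", "value": "您好,在的。"},
    ],
}

def strictly_alternates(conversations):
    """Return True if turns alternate user/assistant, start with a user turn,
    and end with an assistant turn."""
    expected = "user"
    for turn in conversations:
        if turn["from"] != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return expected == "user"  # ends on an assistant turn

print(strictly_alternates(sample["conversations"]))  # False: two user turns in a row
```

Running a check like this over the whole training file would flag every conversation that departs from the strict alternation the base models were trained on.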

Jarvanen commented 6 months ago

The fine-tuned models were also run on evaluation sets such as C-Eval, and the scores only dropped slightly. I tried 1.8B, 7B, and 14B; all of them show the same problems of not fully following prompts and directly outputting fine-tuning data. Could it also be because certain utterances appear too frequently in the multi-turn dialogues? The data is customer-service conversations, and the agents use some fixed replies.
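One way to spot such over-represented fixed replies is to count how often each assistant message repeats across the dataset; a sketch, where the two samples are invented stand-ins for the real customer-service data:

```python
from collections import Counter

# Hypothetical samples in the conversations format; in practice these would be
# loaded from the JSON training file.
samples = [
    {"conversations": [
        {"from": "user", "value": "订单没收到"},
        {"from": "assistant", "value": "请您提供订单号,我帮您查询。"},
    ]},
    {"conversations": [
        {"from": "user", "value": "物流太慢了"},
        {"from": "assistant", "value": "请您提供订单号,我帮您查询。"},
    ]},
]

# Count how many times each distinct assistant reply occurs.
reply_counts = Counter(
    turn["value"]
    for sample in samples
    for turn in sample["conversations"]
    if turn["from"] == "assistant"
)

# Replies covering a large share of the data are likely to be memorized verbatim.
total = sum(reply_counts.values())
for reply, count in reply_counts.most_common(5):
    print(f"{count:4d}  {count / total:5.1%}  {reply[:40]}")
```

Replies that dominate the counts are candidates for deduplication or paraphrasing before fine-tuning.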

jklj077 commented 6 months ago

The finetune.py does indeed support the user/assistant dialogue structure where messages can appear in an arbitrary order, but the models have not been trained on data formatted in this way.

This might imply that the model could potentially overfit to your specific dataset, where data is arranged in a novel manner.

To mitigate this issue, we recommend one of two actions:

  1. You could revise your data to conform to the user/assistant format in a fixed order, ensuring that the model adapts appropriately during fine-tuning.
  2. Alternatively, consider diversifying your dataset by including various conversation structures to enhance the model's adaptability and reduce the risk of overfitting to a single conversation pattern.

Please note that these suggestions are based on general best practices in machine learning and may not be a final resolution to your problem.
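A minimal sketch of option 1 above: collapsing consecutive turns from the same role so the data strictly alternates. The newline separator is an assumption; adapt it to your data.

```python
def merge_consecutive_turns(conversations, sep="\n"):
    """Collapse consecutive turns from the same role into one turn,
    so the result strictly alternates user/assistant."""
    merged = []
    for turn in conversations:
        if merged and merged[-1]["from"] == turn["from"]:
            # Same role as the previous turn: append to it instead of
            # starting a new turn.
            merged[-1]["value"] += sep + turn["value"]
        else:
            merged.append({"from": turn["from"], "value": turn["value"]})
    return merged

turns = [
    {"from": "user", "value": "你好"},
    {"from": "user", "value": "在吗?"},
    {"from": "assistant", "value": "您好,在的。"},
]
print(merge_consecutive_turns(turns))
# Two turns: the consecutive user messages are merged into one.
```

This keeps all the original text while restoring the alternating structure the base models were trained on.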

Jarvanen commented 6 months ago

> The finetune.py does indeed support the user/assistant dialogue structure where messages can appear in an arbitrary order, but the models have not been trained on data formatted in this way.
>
> This might imply that the model could potentially overfit to your specific dataset, where data is arranged in a novel manner.
>
> To mitigate this issue, we recommend one of two actions:
>
> 1. You could revise your data to conform to the user/assistant format in a fixed order, ensuring that the model adapts appropriately during fine-tuning.
> 2. Alternatively, consider diversifying your dataset by including various conversation structures to enhance the model's adaptability and reduce the risk of overfitting to a single conversation pattern.

Thank you. One more question: with a small dataset (a few thousand dialogues) and LoRA fine-tuning (using the script's default hyperparameters), is it normal that the merged model no longer follows prompts well? For example, before fine-tuning, asking the model to convert its output to JSON format worked fine, but after fine-tuning, the same request has no effect at all. The fine-tuning data contains nothing JSON-related; do I need to add some?

jklj077 commented 6 months ago

This phenomenon suggests catastrophic forgetting, and it could be related to the hyperparameter settings used in fine-tuning. Balancing the data is normally a viable mitigation, but whether it works depends on your specific use case.
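As one possible way to balance the data, a sketch of blending domain samples with general instruction samples before fine-tuning; the 30% default ratio is illustrative only, not a recommended value:

```python
import random

def mix_datasets(domain_samples, general_samples, general_ratio=0.3, seed=0):
    """Blend domain data with general instruction data to reduce forgetting.

    general_ratio is the fraction of the final mix drawn from general data;
    the 0.3 default is an illustrative guess, not a tuned recommendation.
    """
    rng = random.Random(seed)
    # Number of general samples needed so they form general_ratio of the mix.
    n_general = int(len(domain_samples) * general_ratio / (1 - general_ratio))
    picked = rng.sample(general_samples, min(n_general, len(general_samples)))
    mixed = list(domain_samples) + picked
    rng.shuffle(mixed)
    return mixed
```

Keeping some general instruction-following examples (including, for the JSON case above, a few JSON-output examples) in the training mix is a common way to preserve capabilities the base model already had.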