QwenLM / Qwen

The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
Apache License 2.0

[BUG] Fine-tuning on multi-turn dialogue data: token_type_ids is None #1096

Closed — sunyclj closed this issue 7 months ago

sunyclj commented 7 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

When fine-tuning on multi-turn dialogue data, how is the training data processed, and where is the relevant code? For a three-turn conversation Q1A1Q2A2Q3A3 participating in training, shouldn't it form three training pairs? Why is token_type_ids None? If multi-turn dialogue is not handled specially, does training use only a single sample per conversation, i.e. Q1A1Q2A2Q3 --> A3?

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response

jklj077 commented 7 months ago

https://github.com/QwenLM/Qwen/blob/85cb093f20e1faa1d772d6abe8458661f3b09202/finetune.py#L125
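The linked `preprocess` code in `finetune.py` addresses the asker's concern: a multi-turn conversation is packed into one sample, but the loss is computed on every assistant turn (A1, A2, and A3), not only the last one, because non-assistant tokens are masked out of the labels with the ignore index. The sketch below illustrates that masking idea with a toy byte-level tokenizer; the exact template handling in the repo differs, and `build_labels` and `tok` here are hypothetical names, not the repo's API.

```python
# Sketch of multi-turn loss masking (an illustration of the idea in
# finetune.py's preprocess, not the actual code).
IGNORE_TOKEN_ID = -100  # label value ignored by PyTorch's cross-entropy loss

def build_labels(turns, tokenize):
    """Concatenate a whole conversation into one sample; only assistant
    tokens keep their ids as labels, everything else is masked."""
    input_ids, labels = [], []
    for role, text in turns:
        ids = tokenize(f"<|im_start|>{role}\n{text}<|im_end|>\n")
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # loss is computed on every answer turn
        else:
            labels.extend([IGNORE_TOKEN_ID] * len(ids))  # user turns masked
    return input_ids, labels

# Toy tokenizer: one "token" per byte, so lengths line up exactly.
tok = lambda s: list(s.encode())

turns = [("user", "Q1"), ("assistant", "A1"),
         ("user", "Q2"), ("assistant", "A2"),
         ("user", "Q3"), ("assistant", "A3")]
ids, labels = build_labels(turns, tok)
assert len(ids) == len(labels)
# All three assistant spans are supervised, so A1, A2 and A3 each contribute:
supervised = sum(l != IGNORE_TOKEN_ID for l in labels)
assert supervised == 3 * len(tok("<|im_start|>assistant\nA1<|im_end|>\n"))
```

In other words, one conversation yields one sample, but it effectively supervises every answer in the dialogue at once, which is why no separate Q1A1 / Q1A1Q2A2 samples are created.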