QwenLM / Qwen

The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
Apache License 2.0

[BUG] Fine-tuning on multi-turn dialogue data: token_type_ids is None #1096

Closed — sunyclj closed this issue 7 months ago

sunyclj commented 7 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

When fine-tuning on multi-turn dialogue data, how is the training data processed, and where is the relevant code? For a three-turn conversation Q1A1Q2A2Q3A3 participating in training, shouldn't it form three training pairs? Why is token_type_ids None? If multi-turn dialogue is not handled specially, does training use only a single sample per conversation, i.e. Q1A1Q2A2Q3 --> A3?

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response

jklj077 commented 7 months ago

https://github.com/QwenLM/Qwen/blob/85cb093f20e1faa1d772d6abe8458661f3b09202/finetune.py#L125
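The linked `preprocess` code in `finetune.py` addresses the asker's concern: a multi-turn conversation is packed into one sample, but the loss is computed on every assistant turn (A1, A2, and A3), not only the last one, because non-assistant tokens are masked out of the labels with the ignore index. The sketch below illustrates that masking idea with a toy byte-level tokenizer; the exact template handling in the repo differs, and `build_labels` and `tok` here are hypothetical names, not the repo's API.

```python
# Sketch of multi-turn loss masking (an illustration of the idea in
# finetune.py's preprocess, not the actual code).
IGNORE_TOKEN_ID = -100  # label value ignored by PyTorch's cross-entropy loss

def build_labels(turns, tokenize):
    """Concatenate a whole conversation into one sample; only assistant
    tokens keep their ids as labels, everything else is masked."""
    input_ids, labels = [], []
    for role, text in turns:
        ids = tokenize(f"<|im_start|>{role}\n{text}<|im_end|>\n")
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # loss is computed on every answer turn
        else:
            labels.extend([IGNORE_TOKEN_ID] * len(ids))  # user turns masked
    return input_ids, labels

# Toy tokenizer: one "token" per byte, so lengths line up exactly.
tok = lambda s: list(s.encode())

turns = [("user", "Q1"), ("assistant", "A1"),
         ("user", "Q2"), ("assistant", "A2"),
         ("user", "Q3"), ("assistant", "A3")]
ids, labels = build_labels(turns, tok)
assert len(ids) == len(labels)
# All three assistant spans are supervised, so A1, A2 and A3 each contribute:
supervised = sum(l != IGNORE_TOKEN_ID for l in labels)
assert supervised == 3 * len(tok("<|im_start|>assistant\nA1<|im_end|>\n"))
```

In other words, one conversation yields one sample, but it effectively supervises every answer in the dialogue at once, which is why no separate Q1A1 / Q1A1Q2A2 samples are created.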