THUDM / ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型
Apache License 2.0

[BUG/Help] <title>'NoneType' object is not iterable error when training on my own training set #1426

Open Rita1012 opened 10 months ago

Rita1012 commented 10 months ago

Is there an existing issue for this?

Current Behavior

Traceback (most recent call last):
  File "/home/user01/pytorch_env/ChatGLM-6B-main/ptuning/main.py", line 430, in <module>
    main()
  File "/home/user01/pytorch_env/ChatGLM-6B-main/ptuning/main.py", line 248, in main
    train_dataset = train_dataset.map(
  File "/home/user01/pytorch_env/pytorch_env/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 591, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/user01/pytorch_env/pytorch_env/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 556, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/user01/pytorch_env/pytorch_env/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3089, in map
    for rank, done, content in Dataset._map_single(**dataset_kwargs):
  File "/home/user01/pytorch_env/pytorch_env/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3466, in _map_single
    batch = apply_function_on_filtered_inputs(
  File "/home/user01/pytorch_env/pytorch_env/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3345, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/home/user01/pytorch_env/ChatGLM-6B-main/ptuning/main.py", line 203, in preprocess_function_train
    for turn_idx, (old_query, response) in enumerate(history):
TypeError: 'NoneType' object is not iterable
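The last frame of the traceback shows `enumerate(history)` being called while `history` is `None`. As a minimal sketch of that failure mode and one possible guard, the snippet below treats a `None` history as an empty list before iterating. The helper names `normalize_history` and `count_turns` are hypothetical and are not part of `ptuning/main.py`; whether a guard like this is the right fix for this report is an assumption, not something the traceback confirms.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical alias: one [old_query, response] pair from a "history" field.
Turn = Sequence[str]


def normalize_history(history: Optional[List[Turn]]) -> List[Tuple[str, str]]:
    """Return history as a list of (query, response) pairs; None becomes []."""
    if history is None:
        return []
    return [(turn[0], turn[1]) for turn in history]


def count_turns(history: Optional[List[Turn]]) -> int:
    """Loop in the same shape as the failing line, but over normalized history."""
    turns = 0
    for turn_idx, (old_query, response) in enumerate(normalize_history(history)):
        turns = turn_idx + 1
    return turns


# Iterating None directly reproduces the TypeError reported above.
try:
    for _ in enumerate(None):  # type: ignore[arg-type]
        pass
except TypeError as exc:
    error_message = str(exc)
```

With the guard in place, a record whose history is `None` contributes zero earlier turns instead of raising.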

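Examples from the dataset I used are shown below:
{"prompt": "Four-month-old with a cold, cough, and wheezing. At four months the baby had a stuffy nose; that evening I fed him perilla, tangerine peel, mint, and brown sugar water, and kept feeding it for a day on the second day. On the third day he improved, with no runny or stuffy nose, but he started coughing. The cough was not severe and was tolerable. On the afternoon of the fourth day there was a squeaky wheezing sound. On the fifth day he coughed only occasionally, but whenever he is awake his breathing has a wheezing sound. Can such a young child recover without medication? Will this illness flare up whenever he catches a chill? What should be done now?", "response": "Hello. Based on your description, your child developed four months of cold, cough, and nasal congestion symptoms, but after being fed perilla, tangerine peel, mint, and brown sugar water, the congestion improved and a cough appeared. The cough is accompanied by a wheezing sound, which may be bronchiolitis, a type of pneumonia. Because the child is so young, I recommend taking the child to a hospital for further examination and treatment.", "history": []}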

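{"prompt": "If I am a suitable candidate for minimally invasive double eyelid surgery, could you give me some advice?", "response": "Of course. If you are a suitable candidate for minimally invasive double eyelid surgery, I recommend that you learn about the procedure and its risks in detail beforehand and communicate thoroughly with your doctor. Postoperative care is also very important: you need to follow your doctor's advice on recovery and care, and avoid straining or injuring your eyes. Please remember that everyone's situation is unique, so be sure to make decisions and proceed under the guidance of a professional doctor.", "history": [["I want to get double eyelid surgery now. Will there be any effect? I had strabismus surgery almost 2 years ago. Can I have minimally invasive double eyelid surgery?", "I completely understand your needs. Generally speaking, strabismus surgery has no direct effect on double eyelid surgery. As for whether minimally invasive double eyelid surgery is feasible, we need to evaluate your eye condition further. If your eye condition is suitable, minimally invasive double eyelid surgery is a relatively safe and effective option."]]}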
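Both sample records above carry a list-valued `history`, so they would not trigger the error on their own. One possibility, not confirmed here, is that some other record in the file has `"history": null` or lacks the key entirely. The sketch below scans JSON Lines input for such records. The function name `find_bad_history` and the three-line sample are hypothetical and used only for illustration.

```python
import json
from typing import Iterable, List, Tuple


def find_bad_history(
    lines: Iterable[str], history_key: str = "history"
) -> List[Tuple[int, str]]:
    """Return (line_number, problem) for records whose history is not a list."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        record = json.loads(line)
        if history_key not in record:
            problems.append((line_no, "missing"))
        elif record[history_key] is None:
            problems.append((line_no, "null"))
        elif not isinstance(record[history_key], list):
            problems.append((line_no, "not a list"))
    return problems


# Hypothetical sample: one valid record, one null history, one missing key.
sample = [
    '{"prompt": "q1", "response": "a1", "history": []}',
    '{"prompt": "q2", "response": "a2", "history": null}',
    '{"prompt": "q3", "response": "a3"}',
]
problems = find_bad_history(sample)
```

To check a real training file, the same function can be given an open file handle, for example `find_bad_history(open(path, encoding="utf-8"))`, where `path` is whatever file the training script is pointed at.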

Expected Behavior

Training on my own dataset should run normally.

Steps To Reproduce

Run `bash train_chat.sh`

Environment

- OS: Ubuntu
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): False

Anything else?

No response