yanqiangmiffy / InstructGLM

ChatGLM-6B instruction learning | instruction data | Instruct
MIT License
654 stars 51 forks

Converting my own data with tokenizer_dataset_rows.py raises datasets.builder.DatasetGenerationError #33

Open cat1222 opened 1 year ago

cat1222 commented 1 year ago

After preprocessing my data into JSON format as described in the earlier steps, running tokenizer_dataset_row.py fails with:

dataset = datasets.Dataset.from_generator(..)
...
File "/usr/local/python3.8.3/lib/python3.8/lib/python3.8/site-packages/datasets/builder.py", line 1644, in _prepare_split_single
datasets.builder.DatasetGenerationError: An error occurred while generating the dataset
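One way to narrow this down: `DatasetGenerationError` is a wrapper that `datasets` raises when the generator passed to `Dataset.from_generator` throws, so the real cause is usually the chained exception further up the traceback. A minimal sketch for finding the offending record is to drain the generator in plain Python, outside `datasets`. The `read_jsonl` generator, the one-JSON-object-per-line layout, and the `context`/`target` field names below are assumptions for illustration, not taken from the script in this report.

```python
import json
from typing import Callable, Iterable, Iterator, Optional, Tuple


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Hypothetical generator: parse one JSON object per non-empty line."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def first_failure(
    make_gen: Callable[[], Iterator[dict]],
) -> Optional[Tuple[int, Exception]]:
    """Drain a generator and report where it first raises.

    Returns None if the generator finishes cleanly, otherwise a tuple of
    (number of records yielded before the failure, the exception raised).
    """
    gen = make_gen()
    yielded = 0
    while True:
        try:
            next(gen)
        except StopIteration:
            return None
        except Exception as exc:  # surface the underlying error as-is
            return yielded, exc
        yielded += 1


if __name__ == "__main__":
    # Assumed sample data: the second record is truncated on purpose.
    sample = [
        '{"context": "Instruction: say hi", "target": "hi"}',
        '{"context": "Instruction: say bye", "target": ',
    ]
    print(first_failure(lambda: read_jsonl(sample)))
```

If this reports a failure, the returned count points at the record to inspect in the JSON file; if it returns None, the problem is more likely in the tokenization step than in the data file.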