Question about an error when fine-tuning huanhuan-chat

KMnO4-zx / huanhuan-chat

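Chat-甄嬛 (Chat-Zhen Huan) is a chat language model that imitates Zhen Huan's manner of speaking. It was obtained by LoRA fine-tuning ChatGLM2 on all of Zhen Huan's lines and dialogue from the script of Empresses in the Palace (《甄嬛传》).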

493 stars, 45 forks

JiQiangHuang closed this issue 11 months ago

JiQiangHuang commented 12 months ago

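I'm new to large models and there is a lot I don't understand yet. I'd like to ask about an error I hit during fine-tuning: NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported. Would you have time to take a look? Thanks!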

I did not change the training parameters. The error is as follows. Loading the dataset from ../dataset/train/lora/huanhuan.json fails with:

NotImplementedError("Loading a dataset cached in a LocalFileSystem is not supported.")

KMnO4-zx commented 11 months ago

Hello. One possible cause: the training script lives in the fine_tune/lora directory, so the correct path for loading the dataset should be ../../dataset/train/lora/huanhuan.json.
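To illustrate the path suggestion above, here is a minimal sketch of resolving the dataset path relative to the script's own location, so it does not depend on the current working directory. The script name train.py and the /repo prefix are hypothetical placeholders; only the fine_tune/lora directory and the relative dataset path come from this thread.

```python
import os


def resolve_dataset_path(script_file: str, relative: str) -> str:
    """Join `relative` onto the directory that contains `script_file`
    and collapse any `..` components."""
    script_dir = os.path.dirname(script_file)
    return os.path.normpath(os.path.join(script_dir, relative))


# Hypothetical layout: the training script sits in fine_tune/lora.
script = os.path.join("/repo", "fine_tune", "lora", "train.py")

# Two levels up reaches the repository root, where dataset/ lives.
print(resolve_dataset_path(script, "../../dataset/train/lora/huanhuan.json"))

# One level up only reaches fine_tune/, which has no dataset/ directory.
print(resolve_dataset_path(script, "../dataset/train/lora/huanhuan.json"))
```

Inside a real script, passing `__file__` as `script_file` would give the same behavior regardless of where the script is launched from.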

shiv-cd commented 11 months ago

@JiQiangHuang Check whether fsspec is installed correctly; is_remote_filesystem relies on it to make that determination.
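One way to run that check is to ask the current environment which versions of the relevant packages are installed. This is only a sketch using the standard library's importlib.metadata; the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`,
    or None if it is not installed in this environment."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages discussed in this thread.
for name in ("fsspec", "datasets"):
    print(name, installed_version(name) or "not installed")
```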

anine09 commented 10 months ago

In fact, this problem comes from https://github.com/huggingface/datasets/issues/6352. It is caused by a breaking update in fsspec and was fixed in datasets 2.14.6 (https://github.com/huggingface/datasets/pull/6334), whereas this project's requirements.txt pins

datasets==2.12.0

@KMnO4-zx I suggest rechecking the dependencies and updating requirements.txt.
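As a rough illustration of the version point, the sketch below compares a datasets version string against 2.14.6, the release named above as containing the fix. The helper names are hypothetical and the parsing is deliberately simple: it looks only at the first three numeric components and ignores pre-release suffixes.

```python
import re

# Release that this thread names as containing the fix.
FIXED_IN = (2, 14, 6)


def release_tuple(version: str):
    """Turn a version string such as '2.12.0' into a tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '3.0' with zeros.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(version: str) -> bool:
    """True if `version` is at least the release that contains the fix."""
    return release_tuple(version) >= FIXED_IN


print(has_fix("2.12.0"))  # the version pinned in requirements.txt
print(has_fix("2.14.6"))  # the release with the fix
```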