SkyworkAI / Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.
Other
1.21k stars 111 forks source link

How to load skypile-150B with load_from_disk #78

Open 00ffcc opened 5 months ago

00ffcc commented 5 months ago

I downloaded skypile-150B from Hugging Face to a local directory, but loading it with load_from_disk keeps failing with FileNotFoundError: Directory ./data is neither a Dataset directory nor a DatasetDict directory. What is causing this? Thanks.

00ffcc commented 5 months ago

I downloaded the data with huggingface-cli rather than saving it with save_to_disk. Could that be related?
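
That guess is probably right: in the `datasets` library, load_from_disk only reads directories written by save_to_disk, which it recognizes by marker files (`state.json` with `dataset_info.json`, or `dataset_dict.json`). A raw download of the repository files does not contain them. Below is a minimal sketch of a workaround, assuming the downloaded files are JSON Lines (`*.jsonl`) somewhere under the local directory; the helper names `looks_like_saved_dataset` and `load_local` and the default file pattern are hypothetical and not part of any library.

```python
import glob
import os


def looks_like_saved_dataset(path):
    """Return True if `path` holds the marker files that save_to_disk writes."""
    is_dataset = (
        os.path.isfile(os.path.join(path, "state.json"))
        and os.path.isfile(os.path.join(path, "dataset_info.json"))
    )
    is_dataset_dict = os.path.isfile(os.path.join(path, "dataset_dict.json"))
    return is_dataset or is_dataset_dict


def load_local(path, pattern="*.jsonl"):
    """Hypothetical helper: choose a loader based on what is on disk."""
    # Imported lazily so the check above works without `datasets` installed.
    from datasets import load_dataset, load_from_disk

    if looks_like_saved_dataset(path):
        # Directory was produced by save_to_disk.
        return load_from_disk(path)

    # Assumption: raw files are JSON Lines; adjust `pattern` if they are not.
    files = sorted(glob.glob(os.path.join(path, "**", pattern), recursive=True))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern} under {path}")
    return load_dataset("json", data_files=files, split="train")
```

Calling `load_local("./data")` would then go through `load_dataset("json", ...)` for a raw download. For a corpus of this size, passing `streaming=True` to `load_dataset` may be worth considering so the files are not all converted up front.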