SkyworkAI / Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model parameters, training data, evaluation data, and evaluation methods.
Other
1.21k stars 111 forks

When will the open-source dataset be made available again? #48

Closed Johnson-Ding closed 9 months ago

Johnson-Ding commented 9 months ago

The current link to the open-source data returns a 404.

zhao1iang commented 9 months ago

Hello, the dataset is being upgraded and will be reopened soon.

yuanye1010 commented 9 months ago

> Hello, the dataset is being upgraded and will be reopened soon.

Hello, roughly when will it be reopened?

liushuiwuhenxi commented 9 months ago

> Hello, the dataset is being upgraded and will be reopened soon.

When will it be open-sourced again?

TianwenWei commented 9 months ago

https://huggingface.co/datasets/Skywork/SkyPile-150B Our dataset has been reopened after passing a safety review. We apologize for the inconvenience.