SkyworkAI / Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model parameters, training data, evaluation data, and evaluation methods.

Why can the open-sourced 150B dataset no longer be downloaded from Hugging Face? #30

Closed: sunzhuojun closed this issue 9 months ago

sunzhuojun commented 10 months ago

Thanks.

bltcn commented 10 months ago

Same question.

zhangxiaofan-zy commented 10 months ago

Same question.

yw10 commented 10 months ago

Skywork: A More Open Bilingual Foundation Model

404 Sorry, we can't find the page you are looking for.

grapefruit-pig commented 10 months ago

Same question.

kbwzy commented 10 months ago

Same question.

onlyfew commented 10 months ago

How long is the review expected to take? @zhao1iang

auuuux commented 10 months ago

Same question.

ForeverNewLee commented 10 months ago

Same question. Alternatively, is there anywhere else it can be downloaded from for now?

TianwenWei commented 10 months ago

The dataset should be reopened within a week or two.

spicy-onion-Hit commented 10 months ago

Also requesting this.

genggui001 commented 10 months ago

A week has passed. Asking again.

TianwenWei commented 10 months ago

The relevant authorities are reviewing the data. We appreciate everyone's understanding.

datalee commented 10 months ago

Same question.

ChanChiChoi commented 10 months ago

Same question.

TianwenWei commented 9 months ago

https://huggingface.co/datasets/Skywork/SkyPile-150B Our dataset has been reopened after passing a security review. We apologize for any inconvenience.

genggui001 commented 9 months ago

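https://huggingface.co/datasets/Skywork/SkyPile-150B Our dataset has been reopened after passing a security review. We apologize for any inconvenience.

I compared it with the version I downloaded earlier, and quite a few files have shrunk by more than half. Is that expected?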

TianwenWei commented 9 months ago

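https://huggingface.co/datasets/Skywork/SkyPile-150B Our dataset has been reopened after passing a security review. We apologize for any inconvenience.

I compared it with the version I downloaded earlier, and quite a few files have shrunk by more than half. Is that expected?

In accordance with national laws and regulations, we removed some content that posed security risks, and we also added some new data.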