SkyworkAI / Skywork

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model parameters, training data, evaluation data, and evaluation methods.

add eval script #1

Closed BBuf closed 10 months ago

BBuf commented 10 months ago

mmlu: pr: 62.12%, cangshui/prediction/mmlu/1018-3STEM8-3STEM9-3COT-2LASTTEN-0000001iter/result.txt: 61.8%

cmmlu: pr: 61.82%, cangshui/prediction/cmmlu/1018-3STEM8-3STEM9-3COT-2LASTTEN-0000001iter/result.txt: 61.22%

ceval: pr: 60.55%, cangshui/prediction/ceval/1018-3STEM8-3STEM9-3COT-2LASTTEN-0000001iter/result.txt: 59.45%

gsm8k: pr: 54.8%, cangshui/prediction/gsm8k/1018-3STEM8-3STEM9-3COT-2LASTTEN-0000001iter/result.txt: 53.14%

Accuracy on mmlu/cmmlu/ceval/gsm8k all meets expectations.
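
For anyone who wants to re-check the comparison, here is a minimal sketch of the arithmetic. The numbers are copied from the lines above; the names `RESULTS`, `accuracy_deltas`, and `within_tolerance` are hypothetical, and the 2-point tolerance is only an assumption, since the thread does not state what threshold counts as "expected".

```python
# Accuracy figures in percent, copied from this comment.
# First value: result from this PR's eval script ("pr").
# Second value: the result.txt under cangshui/prediction/ for the same benchmark.
RESULTS = {
    "mmlu": (62.12, 61.8),
    "cmmlu": (61.82, 61.22),
    "ceval": (60.55, 59.45),
    "gsm8k": (54.8, 53.14),
}


def accuracy_deltas(results):
    """Return PR accuracy minus reference accuracy, in percentage points."""
    return {name: round(pr - ref, 2) for name, (pr, ref) in results.items()}


def within_tolerance(results, tolerance=2.0):
    """Check that every benchmark differs by at most `tolerance` points.

    The default of 2.0 is an assumed value for illustration only.
    """
    return all(abs(delta) <= tolerance for delta in accuracy_deltas(results).values())


if __name__ == "__main__":
    for name, delta in accuracy_deltas(RESULTS).items():
        print(f"{name}: {delta:+.2f} points")
    print("all within tolerance:", within_tolerance(RESULTS))
```

On these numbers the PR script scores slightly higher than the reference on every benchmark, with the largest gap on gsm8k.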

chengtbf commented 10 months ago

The README also needs to explain how to run these scripts. Alternatively, add a README.md under the .eval/ directory and link to it from the main README.

BBuf commented 10 months ago

> The README also needs to explain how to run these scripts. Alternatively, add a README.md under the .eval/ directory and link to it from the main README.

done