open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
https://opencompass.org.cn/
Apache License 2.0
4.04k stars 428 forks

When evaluating an LLM on the CMMLU dataset in OpenCompass, the final score is 50. How is this score of 50 calculated, and where is the relevant code? #987

Closed liujie178 closed 7 months ago

liujie178 commented 7 months ago

Describe the feature

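When I evaluate a large model on the CMMLU dataset in OpenCompass, the final score is 50. How is this score of 50 calculated, and where is the code that computes it?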

liushz commented 7 months ago

You can refer to this summarizer config: /opencompass/configs/summarizers/groups/cmmlu.py
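
As a rough illustration of what a summarizer group does with per-subject results, here is a minimal sketch. It assumes a group is a dict with a `name` and a list of `subsets`, and that the group score is the unweighted mean of the subset accuracies. That averaging rule is an assumption not confirmed in this thread, and `group_score`, the subject names, and the numbers are hypothetical. Check the config above and the summarizer code for the actual behavior.

```python
from statistics import mean


def group_score(subset_scores, group):
    """Return the unweighted mean accuracy over the subsets in a group.

    Assumption: each subset contributes equally, regardless of how many
    questions it contains.
    """
    missing = [s for s in group["subsets"] if s not in subset_scores]
    if missing:
        raise KeyError(f"no score for subsets: {missing}")
    return mean(subset_scores[s] for s in group["subsets"])


# Hypothetical per-subject accuracies, in percent.
subset_scores = {"subject_a": 40.0, "subject_b": 60.0, "subject_c": 50.0}

# Hypothetical group definition in the name/subsets shape assumed above.
group = {"name": "example_group", "subsets": ["subject_a", "subject_b", "subject_c"]}

score = group_score(subset_scores, group)
print(f"{group['name']}: {score:.2f}")
```

Under this assumption, a reported score of 50 would mean the per-subject accuracies average to 50%, not that exactly half of all questions were answered correctly, since subjects with different question counts are weighted equally.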

tonysy commented 7 months ago

Feel free to re-open if needed.