open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
https://opencompass.org.cn/
Apache License 2.0
4.07k stars 431 forks

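Hello, I have a question: on the public academic leaderboard on the OpenCompass website, what values of temperature, repetition_penalty, and top_p were used for the Qwen2 models tested, and how were the models deployed? #1523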

Open 13416157913 opened 1 month ago

13416157913 commented 1 month ago

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

1

Reproduces the problem - code/configuration sample

1

Reproduces the problem - command or script

1

Reproduces the problem - error message

1

Other information

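Hello, I have a question: on the public academic leaderboard on the OpenCompass website, what values of temperature, repetition_penalty, and top_p were used for the Qwen2 models tested, and how were the models deployed? [image]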

tonysy commented 1 month ago

We use greedy decoding, with LMDeploy as the inference backend.

13416157913 commented 1 month ago

> We use greedy decoding, with LMDeploy as the inference backend.

Thanks for your answer. How, then, should temperature, repetition_penalty, top_p, and so on be set?

tonysy commented 1 month ago

For greedy decoding, these parameters are not needed.
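As a rough illustration of why temperature and top_p drop out under greedy decoding, here is a minimal pure-Python sketch. It is not OpenCompass or LMDeploy code; the helper names `softmax`, `top_p_filter`, and `greedy_pick` are hypothetical, and the sketch covers only temperature and top_p, not repetition_penalty. Greedy decoding takes the highest-scoring token, and neither a positive temperature nor a nucleus (top_p) cutoff changes which token that is.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature rescales logits; for any positive value the ranking is unchanged.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    # Keep the smallest set of most-probable tokens whose mass reaches top_p.
    # The single most probable token is always kept.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept

def greedy_pick(logits, temperature=1.0, top_p=1.0):
    # Greedy decoding: choose the argmax among the surviving candidates.
    probs = softmax(logits, temperature)
    candidates = top_p_filter(probs, top_p)
    return max(candidates, key=lambda i: probs[i])

logits = [1.2, 3.4, 0.3, 2.9]  # made-up scores for a 4-token vocabulary
picks = {
    greedy_pick(logits, t, p)
    for t in (0.1, 0.7, 1.0, 2.0)
    for p in (0.5, 0.9, 1.0)
}
print(picks)  # {1}: the same token for every setting
```

Every combination selects index 1, the token with the largest logit, which is why these two knobs have no effect once sampling is turned off.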

luhairong11 commented 2 weeks ago

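> For greedy decoding, these parameters are not needed.

I tested by calling an API service and found that the scores differ on every run, whereas loading the model from a local path gives the same result each time. Is something set incorrectly for the API test? I am using temperature=0, top_k=-1, top_p=1. Please see the issue below, thanks: https://github.com/open-compass/opencompass/issues/1634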
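One way to narrow down run-to-run variance like this is to send the same prompt several times and compare the raw completions before any scoring happens. The sketch below assumes a caller-supplied `generate(prompt)` function that wraps whatever API client is in use; `check_determinism` and `fake_backend` are hypothetical names for illustration and are not part of OpenCompass.

```python
from collections import Counter

def check_determinism(generate, prompt, runs=5):
    # Call the backend repeatedly with an identical prompt and count distinct outputs.
    outputs = [generate(prompt) for _ in range(runs)]
    counts = Counter(outputs)
    return len(counts) == 1, counts

def fake_backend(prompt):
    # Stand-in for a real API call; replace with your own client wrapper.
    return prompt.upper()

ok, counts = check_determinism(fake_backend, "2 + 2 =")
print(ok, dict(counts))
```

If the completions already differ at this level, the variance comes from the serving side rather than from the evaluation or scoring step.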