Questions about how models were connected for testing on the medical MedBench benchmark

open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Apache License 2.0

3.78k stars, 405 forks

Describe the feature

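I have a few questions about the medical MedBench leaderboard at https://medbench.opencompass.org.cn/leaderboard and would appreciate answers from the OpenCompass team:

1. The leaderboard has a qwen-chat-72b entry. Was it tested against a locally deployed copy of the open-source qwen-chat-72b, or through Alibaba's API? If it was a local deployment of the open-source qwen-chat, which version was used, v1 or v1.5? If it went through the API, which model version did the online endpoint serve, and was that model open-source or closed-source?
2. When was the test run? Both the release date and the update date are shown as 2024/02/20. What exactly do "release date" (发布日期) and "update date" (更新日期) refer to here?
3. For this MedBench evaluation, which file contains the code that connects to each model or API?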

Will you implement it?

[x] I would like to implement this feature and create a PR!

Questions about how models were connected for testing on the medical MedBench benchmark #905