Will llama3, wizardlm2, and other recent models be tested and published on the leaderboard? (Request to add test results for llama3, wizardlm, and other large models released in April-May 2024)

THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

https://llmbench.ai

Apache License 2.0

1.99k stars, 135 forks

Open dercaft opened 1 month ago

dercaft commented 1 month ago

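Requesting test results for large models released in April-May 2024, such as llama3 and wizardlm. The models on the current leaderboard feel somewhat outdated. Does your team plan to test the latest batch of 2024 models?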

zhc7 commented 1 month ago

Hi, the latest AgentBench results can be found here: https://fm.ai.tsinghua.edu.cn/superbench/#/leaderboard