THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
https://llmbench.ai
Apache License 2.0
2.15k stars 150 forks

How to calculate the overall score? #79

Closed zhimin-z closed 10 months ago

zhimin-z commented 10 months ago

It seems the overall score is not the average of the single-task scores. How is it actually calculated? [image]

zhimin-z commented 10 months ago

Found it! [image]
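The attached screenshot is not preserved in this text, so the exact formula is not shown here. As a rough sketch only, here is one way an overall score can differ from a plain mean of task scores: each task score is first divided by a per-task normalizing constant (for example, the mean score of all evaluated models on that task) and the normalized values are then averaged. The function name `overall_score`, the task names, and every number below are hypothetical, and this weighting scheme is an assumption rather than something confirmed by this thread.

```python
def overall_score(scores, task_means):
    """Average the per-task scores after dividing each by its normalizing constant.

    Assumption: `task_means` holds a positive normalizing constant per task,
    e.g. the mean score of all evaluated models on that task.
    """
    normalized = [scores[task] / task_means[task] for task in scores]
    return sum(normalized) / len(normalized)


# Hypothetical numbers, chosen only to illustrate the effect of normalization.
scores = {"task_a": 40.0, "task_b": 2.0}
task_means = {"task_a": 20.0, "task_b": 4.0}

plain_mean = sum(scores.values()) / len(scores)  # dominated by the large-scale task
overall = overall_score(scores, task_means)      # each task contributes on a comparable scale

print(plain_mean, overall)
```

With normalization of this kind, a task whose raw scores are numerically large does not dominate the total, which would explain why the reported overall score does not match the simple average.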