Benchmark for mistral models

THUDM / AgentBench

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

https://llmbench.ai

Apache License 2.0

2.01k stars, 136 forks

Benchmark for mistral models #122

Open mingxuan-he opened 4 months ago

mingxuan-he commented 4 months ago

Has anyone run, or planned to run, AgentBench on any Mistral models yet?

Mixtral 8x7B seems like one of the best open-source models at the moment, and the new Mistral Large supposedly matches GPT-4. Fast inference providers like Groq and Fireworks also offer Mixtral inference, so there are definitely great synergies with agent use cases.

Great project! Looking forward to updates!

zhc7 commented 4 months ago

Thank you for your suggestion! We are looking forward to it, too.