lmarena / arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.
Apache License 2.0

Can you add deepseek-coder-v2? #29

Closed (Kreijstal closed this issue 5 months ago)

Kreijstal commented 5 months ago

AFAIK it is the best open-source model, no? Also, I would like to see Claude 3.5, GPT-4o, and Qwen2.

CodingWithTim commented 5 months ago

We will add these models and release an official leaderboard very soon. In the meantime, you can look up their Arena-Hard scores in their blog posts. DeepSeek-Coder-v2 and Qwen2-72B-Instruct both mention their Arena-Hard scores in their release notes.