allenai / WildBench

Benchmarking LLMs with Challenging Tasks from Real Users
https://huggingface.co/spaces/allenai/WildBench
Apache License 2.0
177 stars, 25 forks