huggingface / lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
MIT License
461 stars, 53 forks

Add EQ Bench #114

Open lewtun opened 3 months ago

lewtun commented 3 months ago

EQ Bench is a popular benchmark for chat models that aims to measure the emotional intelligence of LLMs. Its appeal is that its scores correlate highly with human preference evals like the LMSYS Chatbot Arena, yet it can be run at a fraction of the cost.

Leaderboard + code + paper: https://eqbench.com/about.html

clefourrier commented 3 months ago

This one also uses a model as a judge, so it will need #75 first.