EQ-bench / EQ-Bench

A benchmark for emotional intelligence in large language models
MIT License

default judge model setting for the leaderboard #21

Closed gyin94 closed 5 months ago

gyin94 commented 5 months ago

May I ask what the default judge model is?

sam-paech commented 5 months ago

For the creative writing leaderboard, it's claude-3-opus.

At some point I will probably make it an aggregate of multiple judges, since every judge model has a small amount of self-bias.

More info here: https://eqbench.com/about.html
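
To illustrate the multi-judge idea mentioned above, here is a minimal sketch of one way such an aggregate could work. It is not the leaderboard's actual implementation. The function name `aggregate_scores`, the judge names, and the scores are all hypothetical, and averaging while skipping a judge's score for its own output is just one assumed way to limit self-bias.

```python
from statistics import mean


def aggregate_scores(scores_by_judge, judged_model):
    """Average one model's scores across judges.

    Assumption (not from the leaderboard): a judge's score for its own
    output is dropped, so self-bias cannot affect the aggregate.
    """
    usable = [
        score
        for judge, score in scores_by_judge.items()
        if judge != judged_model  # skip self-judgments
    ]
    if not usable:
        raise ValueError("no independent judge scores available")
    return mean(usable)


# Hypothetical scores, for illustration only.
example_scores = {"judge-a": 70.0, "judge-b": 80.0, "judge-c": 90.0}

# "judge-c" is also the model being scored, so its own score is ignored.
print(aggregate_scores(example_scores, judged_model="judge-c"))
```

With more judges, a median or trimmed mean could be swapped in for `mean` to reduce the influence of a single outlying judge.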