openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Add eval yaml for Theory of Mind eval #1453

Closed: ojaffe closed this 6 months ago

ojaffe commented 6 months ago

The previous PR that added the Theory of Mind eval mistakenly omitted evals/registry/evals/theory_of_mind.yaml, so the eval could not be run. This PR adds that file.

Test with:

oaieval gpt-3.5-turbo theory_of_mind
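
A missing registry file like the one described above can be caught before running the eval. The following is a minimal sketch, not part of this PR: the helper name missing_registry_files is hypothetical, and it assumes each eval has a YAML file named after it in the registry directory, which matches this eval but may not hold for every entry in the registry.

```python
from pathlib import Path
from typing import Iterable, List


def missing_registry_files(registry_dir: Path, eval_names: Iterable[str]) -> List[str]:
    """Return the eval names that have no <name>.yaml file in registry_dir.

    Hypothetical helper: assumes one YAML file per eval, named after the eval.
    """
    return [
        name
        for name in eval_names
        if not (registry_dir / f"{name}.yaml").is_file()
    ]


if __name__ == "__main__":
    # Registry path taken from the PR description; run from the repository root.
    registry = Path("evals/registry/evals")
    missing = missing_registry_files(registry, ["theory_of_mind"])
    if missing:
        print("Missing registry files for:", ", ".join(missing))
    else:
        print("All registry files present.")
```

An empty result only shows that the file exists; it does not validate the file's contents, so running the oaieval command above remains the real test.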