EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

New Task Request: InflectionAI's Physics GRE #1554

Open RylanSchaeffer opened 8 months ago

RylanSchaeffer commented 8 months ago

There is a new dataset of Physics GRE exams constructed by Inflection AI: https://github.com/InflectionAI/Inflection-Benchmarks?tab=readme-ov-file#physics-gre

This task would be a nice addition :)

ShayekhBinIslam commented 7 months ago

@haileyschoelkopf I'd like to contribute by adding this task, if possible.