allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
374 stars, 47 forks

New week's models + fixes #112

Closed: natolambert closed this 5 months ago

natolambert commented 5 months ago

Closes #110 (new model), closes #100 (docs)

natolambert commented 5 months ago

Also closes #91