allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
375 stars 47 forks

dpo nits #43

Closed: ValentinaPy closed this 7 months ago

natolambert commented 7 months ago

Can you rebase this PR? It's including all the changes from the previous PR.