allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0

bon eval #111

Open yuchenlin opened 2 months ago

natolambert commented 2 months ago

I'll give a more thorough review once the data is moved. You can use the save_to_hub function to do this easily. It would also be good to add the data conversion scripts to the rewardbench/ source directory and call them from the script, so the whole thing runs automatically end to end.
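A rough sketch of what that end-to-end flow could look like, in case it helps. Everything here is an assumption for illustration: the raw row layout (`id`, `prompt`, `completions`), the helper names `convert_bon_rows` and `run_end_to_end`, and the target name are hypothetical, and the `upload` callable is only a stand-in for save_to_hub, whose real signature is not shown in this thread.

```python
from typing import Callable, Dict, Iterable, List


def convert_bon_rows(raw_rows: Iterable[Dict]) -> List[Dict]:
    """Hypothetical conversion step: flatten one row per prompt
    into one row per candidate completion."""
    flat = []
    for row in raw_rows:
        for idx, completion in enumerate(row["completions"]):
            flat.append(
                {
                    "prompt_id": row["id"],
                    "candidate": idx,
                    "prompt": row["prompt"],
                    "text": completion,
                }
            )
    return flat


def run_end_to_end(
    raw_rows: Iterable[Dict],
    upload: Callable[[List[Dict], str], None],
    target: str,
) -> List[Dict]:
    """Convert, then upload, so the eval script needs no manual data step.

    `upload` is a placeholder for whatever upload helper the repo provides;
    it is NOT the real save_to_hub signature.
    """
    converted = convert_bon_rows(raw_rows)
    upload(converted, target)
    return converted


# In-memory stand-in for the upload so the sketch runs without network access.
uploaded: Dict[str, List[Dict]] = {}


def fake_upload(rows: List[Dict], target: str) -> None:
    uploaded[target] = rows


raw = [{"id": 0, "prompt": "Say hi", "completions": ["hi", "hello"]}]
result = run_end_to_end(raw, fake_upload, "hypothetical-org/bon-results")
```

Passing the upload step in as a callable keeps the conversion logic importable and testable on its own, and the eval script can swap in the real upload helper once the data location is settled.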