allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
440 stars 52 forks

bon eval #111

Closed yuchenlin closed 1 month ago

natolambert commented 7 months ago

Will give a more thorough review once the data is moved. You can use the save_to_hub function to do this easily. It would be good to add the data conversion scripts to the source rewardbench/ repo and then call them from the script, so the whole thing runs automatically end to end.