allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
440 stars, 52 forks

Tiny dtype fix #197

Closed by sanderland 1 month ago

sanderland commented 1 month ago

When a model returns bfloat16 outputs, .numpy() can fail, because NumPy has no native bfloat16 dtype. This happens with e.g. Skywork.
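A minimal sketch of the kind of workaround this implies, assuming the scores arrive as a PyTorch tensor. The helper name to_numpy_safe and the sample values are hypothetical and not taken from the RewardBench code; the idea is simply to upcast bfloat16 to float32 before converting:

```python
import numpy as np
import torch


def to_numpy_safe(t: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: convert a tensor to NumPy, upcasting bfloat16 first."""
    t = t.detach().cpu()
    if t.dtype == torch.bfloat16:
        # NumPy has no bfloat16 dtype, so upcast to float32 before converting.
        t = t.float()
    return t.numpy()


# Illustrative scores in bfloat16, standing in for a reward model's output.
scores = torch.tensor([0.5, -1.25, 2.0], dtype=torch.bfloat16)

# Calling .numpy() directly on a bfloat16 tensor is expected to raise.
try:
    scores.numpy()
    direct_ok = True
except (TypeError, RuntimeError):
    direct_ok = False

arr = to_numpy_safe(scores)
print(direct_ok, arr.dtype, arr)
```

Upcasting to float32 is lossless for bfloat16 values, so the converted scores are unchanged. Other dtypes pass through the helper untouched.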