allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
277 stars, 27 forks

Add PoLL for generative RM #118

Closed: natolambert closed this issue 2 months ago

natolambert commented 2 months ago

See this paper, which is essentially an ensemble of LLMs acting as a judge (PoLL): https://arxiv.org/abs/2404.18796
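For illustration only, here is a rough sketch of what "ensemble LLM as a judge" could look like for pairwise comparisons: several judges each pick a preferred answer, and the panel verdict is the majority vote. None of this is RewardBench's API or the paper's exact method. `poll_vote`, the `Judge` type, and the stub judges are all hypothetical names, and in practice each judge would wrap a call to a different LLM rather than a length heuristic.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical interface: a judge maps (prompt, answer_a, answer_b) to "A" or "B".
Judge = Callable[[str, str, str], str]


def poll_vote(judges: Sequence[Judge], prompt: str, answer_a: str, answer_b: str) -> str:
    """Return "A", "B", or "tie" by majority vote over all judges."""
    votes = [judge(prompt, answer_a, answer_b) for judge in judges]
    # Count only well-formed votes; anything else is ignored (an assumption).
    counts = Counter(v for v in votes if v in ("A", "B"))
    if counts["A"] == counts["B"]:
        return "tie"
    return "A" if counts["A"] > counts["B"] else "B"


# Stub judges standing in for real LLM calls (purely illustrative heuristics).
def longer_judge(prompt: str, a: str, b: str) -> str:
    return "A" if len(a) >= len(b) else "B"


def shorter_judge(prompt: str, a: str, b: str) -> str:
    return "A" if len(a) <= len(b) else "B"


def always_a_judge(prompt: str, a: str, b: str) -> str:
    return "A"


if __name__ == "__main__":
    panel = [longer_judge, shorter_judge, always_a_judge]
    print(poll_vote(panel, "What is 2 + 2?", "4", "The answer is four."))
```

Majority voting is just one possible aggregation; averaging scores across judges would be another option, and an even-sized panel needs some tie-breaking rule beyond the "tie" placeholder used here.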