allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
277 stars, 27 forks

[Model Request] mightbe/Better-PairRM #102

Closed: StableFluffy closed this issue 2 months ago

StableFluffy commented 2 months ago

https://huggingface.co/mightbe/Better-PairRM

I made a few fixes to PairRM's dataset filtering, truncation, and related steps.

As a result, I got a performance improvement of at least 15%.

The code and prompt template are in the Hugging Face repo.

Thank you.

natolambert commented 2 months ago

@StableFluffy really cool. Do you have time to open a PR that extends the existing code in models/pairrm.py or make sure inference works? Looks like it shouldn't be too bad.

natolambert commented 2 months ago

Closed with #104