RLHFlow / RLHF-Reward-Modeling

Recipes to train reward models for RLHF.
https://rlhflow.github.io/
Apache License 2.0

Code for Armo on Reward Bench #15

philschmid opened this issue 1 month ago

philschmid commented 1 month ago

Hello, I am curious whether you plan to add code to https://github.com/RLHFlow/RLHF-Reward-Modeling/tree/main/useful_code showing how you run the ArmoRM model on RewardBench.

WeiXiongUST commented 1 month ago

@Haoxiang-Wang Hi Haoxiang, could you look into this?

Haoxiang-Wang commented 1 month ago

Hi @philschmid

For ArmoRM evaluation on RewardBench, I pushed evaluation code directly to RewardBench; you can find the evaluation command in this PR: https://github.com/allenai/reward-bench/pull/135
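
For anyone who lands here before reading that PR, here is a rough sketch of what a RewardBench-style pairwise check with ArmoRM could look like. This is not the code from the PR; treating `output.score` as the gating-aggregated preference score is an assumption about the model's custom (trust_remote_code) head, and the prompt/response pair is made up.

```python
# Minimal sketch, not the actual reward-bench eval code.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

path = "RLHFlow/ArmoRM-Llama3-8B-v0.1"
device = "cuda"

model = AutoModelForSequenceClassification.from_pretrained(
    path, device_map=device, trust_remote_code=True, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(path)


def preference_score(prompt: str, response: str) -> float:
    """Score a single (prompt, response) pair with ArmoRM."""
    messages = [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
    ]
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
    with torch.no_grad():
        output = model(input_ids)
    # Assumption: `score` is the scalar preference score from the custom head.
    return output.score.float().item()


# RewardBench-style accuracy counts a pair as correct when the chosen
# response outscores the rejected one.
prompt = "What is the capital of France?"
chosen = "The capital of France is Paris."
rejected = "The capital of France is Berlin."
print("chosen preferred:", preference_score(prompt, chosen) > preference_score(prompt, rejected))
```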

philschmid commented 1 month ago

Hey, thank you for sharing. Based on your RewardBench PR, I opened a PR on your model card to make the usage a bit easier for non-experts; it uses the preference score for the response, aggregated from the multi-objective rewards with the gating layer.

https://huggingface.co/RLHFlow/ArmoRM-Llama3-8B-v0.1/discussions/3
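
For context, a minimal sketch of the kind of usage that model-card PR describes: run one conversation through the model and pull out the multi-objective rewards, the gating-layer weights, and the aggregated preference score. The output attribute names (`rewards`, `gating_output`, `score`) come from the model's trust_remote_code head and are assumptions here, as is the example conversation.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

path = "RLHFlow/ArmoRM-Llama3-8B-v0.1"
device = "cuda"

model = AutoModelForSequenceClassification.from_pretrained(
    path, device_map=device, trust_remote_code=True, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(path)

messages = [
    {"role": "user", "content": "Give me one sentence on why the sky is blue."},
    {"role": "assistant", "content": "Sunlight scatters off air molecules, and shorter blue wavelengths scatter the most."},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)

with torch.no_grad():
    output = model(input_ids)

# Assumed attribute names from the custom modeling code:
multi_obj_rewards = output.rewards.float().cpu()      # one reward per objective (helpfulness, safety, ...)
gating_weights = output.gating_output.float().cpu()   # mixing weights produced by the gating layer
preference_score = output.score.float().cpu()         # aggregated scalar used to rank responses

print(multi_obj_rewards, gating_weights, preference_score)
```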