NVIDIA / NeMo-Aligner

Scalable toolkit for efficient model alignment
Apache License 2.0

reward-bench for Reward Model #230

Open lss11005 opened 2 months ago

lss11005 commented 2 months ago

After training the RM (steps 1-3) with SteerLM, I get a reward model checkpoint (.nemo). Is this the final reward model?

The Nemotron-4-340B technical report shows the performance of the reward model on RewardBench. Can you share the specific RewardBench evaluation method, e.g. the model conversion step (NeMo -> HF) and the parameter configuration used during testing (chat_template, ...)?

Can I swap in a different base model to train the reward model, such as Mistral-7B? Which parameters should be modified?
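For context on what the RewardBench evaluation ultimately computes: its core metric is pairwise accuracy, the fraction of prompts where the reward model scores the chosen response above the rejected one. A minimal sketch of that metric (the function name and sample rewards below are illustrative, not taken from the reward-bench codebase):

```python
def pairwise_accuracy(chosen_rewards, rejected_rewards):
    """Fraction of (chosen, rejected) pairs where the chosen response out-scores the rejected one."""
    assert len(chosen_rewards) == len(rejected_rewards) and chosen_rewards
    wins = sum(c > r for c, r in zip(chosen_rewards, rejected_rewards))
    return wins / len(chosen_rewards)

# Example: scalar rewards from a hypothetical RM over three preference pairs
chosen = [2.1, 0.4, 1.7]
rejected = [1.0, 0.9, 1.1]
print(pairwise_accuracy(chosen, rejected))  # 2 of the 3 pairs are ranked correctly
```

So regardless of how the .nemo checkpoint is converted or served, the evaluation only needs a way to get a scalar reward per (prompt, response) pair; the chat template matters because it determines how each pair is formatted before scoring.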

berserkr commented 1 month ago

I suspect that once NeMo models are supported in Transformers, it will be easier to build a pipeline to run RewardBench. In the meantime I have a simple hack: https://github.com/berserkr/NeMo-Aligner/blob/main/examples/nlp/gpt/nemo_bench.py :) You run it the same way you would run inference.