allenai / reward-bench

RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0

[Add Model] Better-PairRM + Relative path #104

Closed. StableFluffy closed this 7 months ago

StableFluffy commented 7 months ago

Added Better-PairRM support. Changed the result-saving path to a relative path.

Benchmark Result

model:"mightbe/Better-PairRM"
model_type:"Custom Classifier"
chat_template:"tulu"
alpacaeval-easy:0.98
alpacaeval-hard:1
alpacaeval-length:0.8631578947368421
donotanswer:0.5220588235294118
hep-cpp:0.6463414634146342
hep-go:0.7134146341463414
hep-java:0.7012195121951219
hep-js:0.7073170731707317
hep-python:0.7134146341463414
hep-rust:0.725609756097561
llmbar-adver-GPTInst:0.14130434782608695
llmbar-adver-GPTOut:0.44680851063829785
llmbar-adver-manual:0.2608695652173913
llmbar-adver-neighbor:0.27611940298507465
llmbar-natural:0.72
math-prm:0.2930648769574944
mt-bench-easy:0.9285714285714286
mt-bench-hard:0.7027027027027027
mt-bench-med:1
refusals-dangerous:0.73
refusals-offensive:0.94
xstest-should-refuse:0.9675324675324676
xstest-should-respond:0.876
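
For a quick read of the per-subset numbers above, here is a minimal sketch that averages the six hep-* scores. The unweighted mean and the names `hep_scores` and `unweighted_mean` are illustrative assumptions, not the aggregation the leaderboard uses.

```python
# Scores copied verbatim from the hep-* rows of the benchmark result above.
hep_scores = {
    "hep-cpp": 0.6463414634146342,
    "hep-go": 0.7134146341463414,
    "hep-java": 0.7012195121951219,
    "hep-js": 0.7073170731707317,
    "hep-python": 0.7134146341463414,
    "hep-rust": 0.725609756097561,
}


def unweighted_mean(scores):
    """Plain arithmetic mean over subsets (an assumption, not the official weighting)."""
    values = list(scores.values())
    return sum(values) / len(values)


hep_mean = unweighted_mean(hep_scores)
print(f"hep-* unweighted mean: {hep_mean:.4f}")
```

The same helper can be pointed at any other group of rows, for example the four llmbar-adver-* subsets, to see where the model is weakest.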

natolambert commented 7 months ago

This is great @StableFluffy -- can you run

make style
make quality

and fix any minor things :)

StableFluffy commented 7 months ago

Done!

natolambert commented 7 months ago

@StableFluffy have you seen this error upon loading?

[INFO|modeling_utils.py:1491] 2024-04-11 08:46:15,579 >> Instantiating DebertaV2ForSequenceClassification model under default dtype torch.float16.
Traceback (most recent call last):
  File "/net/nfs.cirrascale/allennlp/nathanl/herm/scripts/run_rm.py", line 336, in <module>
    main()
  File "/net/nfs.cirrascale/allennlp/nathanl/herm/scripts/run_rm.py", line 169, in main
    model = model_builder(args.model, **model_kwargs, trust_remote_code=trust_remote_code)
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3671, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3947, in _load_pretrained_model
    model.apply(model._initialize_weights)
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 897, in apply
    module.apply(fn)
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 897, in apply
    module.apply(fn)
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 897, in apply
    module.apply(fn)
  [Previous line repeated 4 more times]
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 898, in apply
    fn(self)
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1704, in _initialize_weights
    self._init_weights(module)
  File "/net/nfs.cirrascale/allennlp/nathanl/miniconda3/envs/herm/lib/python3.10/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py", line 919, in _init_weights
    module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
RuntimeError: "normal_kernel_cpu" not implemented for 'Char'

natolambert commented 7 months ago

Another issue, when trying to run with --pref_sets, is

AssertionError: <source> id not in input_ids