allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
440 stars
52 forks
issues
Add generative models
#156
natolambert
closed
4 months ago
0
Add bfloat16 support natively
#155
natolambert
closed
4 months ago
0
fix padding for GRM class
#154
YangRui2015
closed
4 months ago
2
Add Claude 3.5 Sonnet
#153
natolambert
closed
4 months ago
1
New models + dockerfile
#152
natolambert
closed
4 months ago
0
Add GRM classes
#151
YangRui2015
closed
4 months ago
0
Add New reward models
#150
YangRui2015
closed
4 months ago
2
Model Test Application
#149
wjxxyz
closed
4 months ago
3
Fix small bugs
#148
natolambert
closed
4 months ago
0
'model_modifier' referenced before assignment in enclosing scope
#147
tianlu-wang
closed
4 months ago
1
possibly a typo in `load_bon_dataset.py`
#146
mickelliu
closed
4 months ago
0
Fix llama3 quantization for DPO models
#145
natolambert
closed
5 months ago
0
Minor fixes, new dockerfile, new models
#144
natolambert
closed
5 months ago
0
New Gemma-7b DPO Model
#143
ajseo17
closed
4 months ago
12
Fix DPO prompts
#142
natolambert
closed
5 months ago
0
New super secret models
#141
natolambert
closed
5 months ago
1
Prompt Repeated in DPO `tokenize_row` (not actually sure if this is an issue)
#140
PootieT
closed
5 months ago
3
Clean, minor fixes, and release 0.1.2
#139
natolambert
closed
5 months ago
0
Determinism experiments
#138
natolambert
closed
5 months ago
0
rewardbench.py results are different for different batch size for beaver-7b
#137
andrewsiah
closed
5 months ago
43
Prepare for submission
#136
natolambert
closed
5 months ago
0
Add ArmoRM to RewardBench
#135
Haoxiang-Wang
closed
6 months ago
0
Do we need to add system prompt when training/evaluating RM?
#134
hank0316
closed
6 months ago
1
Gemini prompt for llm-as-a-judge
#133
natolambert
closed
5 months ago
0
Logging some new models
#132
natolambert
closed
5 months ago
0
Fixes to analysis scripts
#131
natolambert
closed
6 months ago
0
Set up OpenRouter for llm-as-a-judge
#130
natolambert
closed
5 months ago
1
Mixed bag of fixes / updates
#129
natolambert
closed
6 months ago
0
update the __call__ for slicpairpm
#128
WeiXiongUST
closed
6 months ago
2
Update the parameters of __call__ for Slic pair PM so that the test runs smoothly
#127
WeiXiongUST
closed
6 months ago
1
Update pip Installation Command
#126
Haoxiang-Wang
closed
6 months ago
0
Add multi-gpu inference option
#125
natolambert
opened
6 months ago
2
Improve run_generative documentation + add to pip
#124
natolambert
closed
6 months ago
0
[Add Model] Pairwise Preference Model
#123
WeiXiongUST
closed
6 months ago
3
Add generative models to pip install (probably with optional dependencies)
#122
natolambert
closed
6 months ago
0
Make RewardBench pip installable + runable!
#121
natolambert
closed
6 months ago
0
Ensemble pre-computed reward outputs
#120
natolambert
closed
6 months ago
0
Implement PoLL (LLM-as-a-judge ensembles)
#119
natolambert
closed
6 months ago
0
Add PoLL for generative RM
#118
natolambert
closed
6 months ago
0
Add `pad_token_id` from tokenizer to model config.
#117
hank0316
closed
7 months ago
2
Clarification Needed on DPO Reward Evaluation
#116
ZHZisZZ
closed
7 months ago
4
`pad_token_id` issue
#115
hank0316
closed
7 months ago
7
Add `rewardbench` on pypi + basic release management
#114
natolambert
closed
6 months ago
0
Use VLLM for llm as a judge on open models
#113
natolambert
closed
7 months ago
0
New week's models + fixes
#112
natolambert
closed
7 months ago
1
bon eval
#111
yuchenlin
closed
1 month ago
1
New LLaMA-3 Seq. Classfier Model
#110
hendrydong
closed
7 months ago
6
Fix type hint
#109
PavelCz
closed
7 months ago
0
Missed commit
#108
natolambert
closed
7 months ago
0
Update paths
#107
natolambert
closed
7 months ago
0