allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
375 stars, 47 forks
issues
DPO ref free sweep prep
#96
natolambert
closed
6 months ago
1
multi gpu inference with run_rm.py
#95
SeungoneKim
closed
3 months ago
3
Fix EOS token bug on FastChat models (non DPO)
#94
natolambert
closed
6 months ago
0
Experiment request: DPO with different betas
#93
natolambert
closed
5 months ago
1
Visualization requests
#92
natolambert
closed
3 months ago
1
Output leaderboard scores when running `run_rm.py`
#91
natolambert
closed
5 months ago
0
Check EOS token on FastChat models
#90
natolambert
closed
6 months ago
1
Saving bug (non breaking)
#89
natolambert
closed
6 months ago
0
Dataset v2 discussion & feedback
#88
natolambert
opened
6 months ago
4
[Core team] Migrate Prior Sets to 50% weight
#87
natolambert
closed
5 months ago
1
Initial generative RM implementation (via API)
#86
natolambert
closed
6 months ago
1
New week new models
#85
natolambert
closed
6 months ago
0
adding Archangel models (dpo, kto, sft+dpo, sft+kto)
#84
kawine
closed
6 months ago
0
Rename Starling 34B
#83
natolambert
closed
6 months ago
0
Clean up / enhance DPO code
#82
natolambert
closed
3 months ago
1
stanfordnlp/SteamSHP-flan-t5 performance on SHP and HH-RLHF Helpful
#81
timbmg
closed
6 months ago
1
Add new model weqweasdas/RM-Mistral-7B
#80
WeiXiongUST
closed
6 months ago
0
Add a new mistral RM model
#79
hendrydong
closed
6 months ago
1
Add models, refactor eval configs, fix beaver cost
#78
natolambert
closed
6 months ago
1
Add new model Mistral-7B-instruct-Unified-Feedback
#77
YangRui2015
closed
6 months ago
0
Add Nvidia RMs (and Nemo compatibility)
#76
natolambert
closed
3 months ago
1
Check beaver cost model
#75
natolambert
closed
6 months ago
1
Update train_rm.py
#74
eltociear
closed
6 months ago
0
update paper
#73
natolambert
closed
6 months ago
0
Include MT Bench score figure
#72
ljvmiranda921
closed
6 months ago
0
Improve model distribution
#71
ljvmiranda921
closed
6 months ago
0
Small PR to add OLMo Instruct
#70
natolambert
closed
6 months ago
0
Add contributing models text
#69
natolambert
closed
6 months ago
1
Auto-rotate the column names so that it's easier to copy
#68
ljvmiranda921
closed
6 months ago
0
Plot distribution of RM scores for each RM
#67
natolambert
closed
6 months ago
0
Small nits
#66
natolambert
closed
6 months ago
0
Refactor visualization
#65
ljvmiranda921
closed
6 months ago
0
Add name substitution to benchmark results
#64
ljvmiranda921
closed
6 months ago
0
Minor README fix
#63
ljvmiranda921
closed
6 months ago
0
Nit to length table
#62
natolambert
closed
6 months ago
0
Paper ready plot... for appendix at least
#61
natolambert
closed
6 months ago
0
Configs & release docker
#60
natolambert
closed
6 months ago
0
Cleanup of auxiliary scripts
#59
ljvmiranda921
closed
6 months ago
1
Remove PKU and other cleaning
#58
natolambert
closed
6 months ago
0
WIP Code to replace one subset (if we want it)
#57
natolambert
closed
6 months ago
0
Remove excess requirement
#56
natolambert
closed
7 months ago
0
Update README.md
#55
ljvmiranda921
closed
7 months ago
0
Rename to RewardBench
#54
natolambert
closed
7 months ago
0
Jacob morrison patch 1
#53
jacob-morrison
closed
7 months ago
0
Plotting fixes / improvements
#52
natolambert
closed
7 months ago
0
Support Nous Mixtral
#51
natolambert
closed
6 months ago
1
Add trust remote code to tokenizer
#50
natolambert
closed
7 months ago
0
fix beaker save
#49
jacob-morrison
closed
7 months ago
0
Check Qwen model
#48
natolambert
closed
6 months ago
1
Nits, fixes, and Qwen
#47
natolambert
closed
7 months ago
0