allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
440 stars, 52 forks
issues
Is eval set on huggingface the eval set or train set?
#106
andrewsiah
closed
7 months ago
1
Update table loading
#105
ljvmiranda921
closed
7 months ago
0
[Add Model] Better-PairRM + Relative path
#104
StableFluffy
closed
7 months ago
4
Bump black from 23.1.0 to 24.3.0
#103
dependabot[bot]
closed
7 months ago
0
[Model Request] mightbe/Better-PairRM
#102
StableFluffy
closed
7 months ago
2
Newest week's models
#101
natolambert
closed
7 months ago
0
adding kto as a separate category
#100
kawine
closed
7 months ago
4
More models
#99
natolambert
closed
7 months ago
0
Saving fix
#98
natolambert
closed
7 months ago
0
Minor run_rm.py fixes
#97
PavelCz
closed
7 months ago
3
DPO ref free sweep prep
#96
natolambert
closed
7 months ago
1
multi gpu inference with run_rm.py
#95
SeungoneKim
closed
5 months ago
3
Fix EOS token bug on FastChat models (non DPO)
#94
natolambert
closed
7 months ago
0
Experiment request: DPO with different betas
#93
natolambert
closed
7 months ago
1
Visualization requests
#92
natolambert
closed
4 months ago
1
Output leaderboard scores when running `run_rm.py`
#91
natolambert
closed
7 months ago
0
Check EOS token on FastChat models
#90
natolambert
closed
7 months ago
1
Saving bug (non breaking)
#89
natolambert
closed
7 months ago
0
Dataset v2 discussion & feedback
#88
natolambert
opened
8 months ago
4
[Core team] Migrate Prior Sets to 50% weight
#87
natolambert
closed
7 months ago
1
Initial generative RM implementation (via API)
#86
natolambert
closed
7 months ago
1
New week new models
#85
natolambert
closed
7 months ago
0
adding Archangel models (dpo, kto, sft+dpo, sft+kto)
#84
kawine
closed
7 months ago
0
Rename Starling 34B
#83
natolambert
closed
7 months ago
0
Clean up / enhance DPO code
#82
natolambert
closed
4 months ago
1
stanfordnlp/SteamSHP-flan-t5 performance on SHP and HH-RLHF Helpful
#81
timbmg
closed
8 months ago
1
Add new model weqweasdas/RM-Mistral-7B
#80
WeiXiongUST
closed
8 months ago
0
Add a new mistral RM model
#79
hendrydong
closed
8 months ago
1
Add models, refactor eval configs, fix beaver cost
#78
natolambert
closed
8 months ago
1
Add new model Mistral-7B-instruct-Unified-Feedback
#77
YangRui2015
closed
8 months ago
0
Add Nvidia RMs (and Nemo compatibility)
#76
natolambert
closed
4 months ago
2
Check beaver cost model
#75
natolambert
closed
8 months ago
1
Update train_rm.py
#74
eltociear
closed
8 months ago
0
update paper
#73
natolambert
closed
8 months ago
0
Include MT Bench score figure
#72
ljvmiranda921
closed
8 months ago
0
Improve model distribution
#71
ljvmiranda921
closed
8 months ago
0
Small PR to add OLMo Instruct
#70
natolambert
closed
8 months ago
0
Add contributing models text
#69
natolambert
closed
8 months ago
1
Auto-rotate the column names so that it's easier to copy
#68
ljvmiranda921
closed
8 months ago
0
Plot distribution of RM scores for each RM
#67
natolambert
closed
8 months ago
0
Small nits
#66
natolambert
closed
8 months ago
0
Refactor visualization
#65
ljvmiranda921
closed
8 months ago
0
Add name substitution to benchmark results
#64
ljvmiranda921
closed
8 months ago
0
Minor README fix
#63
ljvmiranda921
closed
8 months ago
0
Nit to length table
#62
natolambert
closed
8 months ago
0
Paper ready plot... for appendix at least
#61
natolambert
closed
8 months ago
0
Configs & release docker
#60
natolambert
closed
8 months ago
0
Cleanup of auxiliary scripts
#59
ljvmiranda921
closed
8 months ago
1
Remove PKU and other cleaning
#58
natolambert
closed
8 months ago
0
WIP Code to replace one subset (if we want it)
#57
natolambert
closed
8 months ago
0