
allenai / reward-bench

RewardBench: the first evaluation tool for reward models.

https://huggingface.co/spaces/allenai/reward-bench

Apache License 2.0

375 stars 47 forks

issues


Fix chat template issues

#46 natolambert closed 7 months ago
0
Train rm

#45 jacob-morrison closed 7 months ago
0
deleted some nits and initialized a variable

#44 ValentinaPy closed 7 months ago
0
dpo nits

#43 ValentinaPy closed 7 months ago
1
fixes!

#42 natolambert closed 7 months ago
0
Set default chat template to None

#41 natolambert closed 7 months ago
1
Actual fix + DPO Ref Free

#40 natolambert closed 7 months ago
0
small dpo fixes

#39 natolambert closed 7 months ago
0
WIP: Add experiments on per-token reward

#38 ljvmiranda921 opened 7 months ago
0
Pref Sets updates

#37 natolambert closed 7 months ago
1
Clean up model loading system

#36 natolambert closed 7 months ago
0
Truncation of long sequences

#35 natolambert closed 7 months ago
1
Fix score saving PairRM and SteamSHP

#34 natolambert closed 7 months ago
2
Add linechart capability for per-token rewards

#33 ljvmiranda921 closed 7 months ago
2
Plot subset distribution across all models

#32 natolambert closed 7 months ago
0
DPO

#31 ValentinaPy closed 7 months ago
0
Best of N pipeline + tests

#30 natolambert closed 7 months ago
0
Per token multiple rms

#29 khyathiraghavi closed 7 months ago
0
Per token multiple rms

#28 khyathiraghavi closed 7 months ago
0
visualizing multiple rewards

#27 khyathiraghavi closed 7 months ago
0
Add model type to results

#26 natolambert closed 7 months ago
1
Update per token reward

#25 ljvmiranda921 closed 7 months ago
2
Experiment with human vs gpt4 data

#24 natolambert opened 7 months ago
1
Clean repo

#23 natolambert closed 7 months ago
0
Improve per-token reward tool

#22 natolambert closed 7 months ago
0
Save scores per prompt

#21 natolambert closed 7 months ago
0
Change data storage location

#20 natolambert closed 7 months ago
1
Docker eval into my training pr

#19 jacob-morrison closed 7 months ago
0
Add docker image and script for submitting eval jobs

#18 jacob-morrison closed 7 months ago
0
Add function to get subtoken statistics

#17 ljvmiranda921 closed 7 months ago
3
Add function to get subtoken statistics

#16 ljvmiranda921 closed 7 months ago
3
Fix code formatting

#15 ljvmiranda921 closed 7 months ago
1
Fix table loading from viewer updates

#14 ljvmiranda921 closed 7 months ago
0
Beaver fix; working towards another model

#13 natolambert closed 7 months ago
4
Add Beaver model from PKU-Alignment

#12 natolambert closed 7 months ago
0
Best of N benchmark

#11 natolambert closed 1 month ago
2
Add markdown / LaTeX tables

#10 ljvmiranda921 closed 8 months ago
2
Print per-token reward over an RM

#9 natolambert closed 8 months ago
1
Fix test failing on main

#8 natolambert closed 8 months ago
0
Updates histogram + prints table

#7 natolambert closed 8 months ago
0
Add function to draw histograms on the evaluation dataset

#6 ljvmiranda921 closed 8 months ago
6
chore: Make .gitignore more comprehensive

#5 ljvmiranda921 closed 8 months ago
1
Save reward scores for each prompt

#4 natolambert closed 7 months ago
0
Generative RM

#3 natolambert closed 6 months ago
0
DATASET TRACKING

#2 natolambert closed 8 months ago
1
Multiple styles of computing reward with DPO

#1 natolambert closed 6 months ago
1