allenai reward-bench issues

allenai / reward-bench

RewardBench: the first evaluation tool for reward models.

https://huggingface.co/spaces/allenai/reward-bench

Apache License 2.0

440 stars, 52 forks

issues

Add generative models

#156 natolambert closed 4 months ago
0
Add bfloat16 support natively

#155 natolambert closed 4 months ago
0
fix padding for GRM class

#154 YangRui2015 closed 4 months ago
2
Add Claude 3.5 Sonnet

#153 natolambert closed 4 months ago
1
New models + dockerfile

#152 natolambert closed 4 months ago
0
Add GRM classes

#151 YangRui2015 closed 4 months ago
0
Add New reward models

#150 YangRui2015 closed 4 months ago
2
Model Test Application

#149 wjxxyz closed 4 months ago
3
Fix small bugs

#148 natolambert closed 4 months ago
0
'model_modifier' referenced before assignment in enclosing scope

#147 tianlu-wang closed 4 months ago
1
possibly a typo in `load_bon_dataset.py`

#146 mickelliu closed 4 months ago
0
Fix llama3 quantization for DPO models

#145 natolambert closed 5 months ago
0
Minor fixes, new dockerfile, new models

#144 natolambert closed 5 months ago
0
New Gemma-7b DPO Model

#143 ajseo17 closed 4 months ago
12
Fix DPO prompts

#142 natolambert closed 5 months ago
0
New super secret models

#141 natolambert closed 5 months ago
1
Prompt Repeated in DPO `tokenize_row` (not actually sure if this is an issue)

#140 PootieT closed 5 months ago
3
Clean, minor fixes, and release 0.1.2

#139 natolambert closed 5 months ago
0
Determinism experiments

#138 natolambert closed 5 months ago
0
rewardbench.py results are different for different batch size for beaver-7b

#137 andrewsiah closed 5 months ago
43
Prepare for submission

#136 natolambert closed 5 months ago
0
Add ArmoRM to RewardBench

#135 Haoxiang-Wang closed 6 months ago
0
Do we need to add system prompt when training/evaluating RM?

#134 hank0316 closed 6 months ago
1
Gemini prompt for llm-as-a-judge

#133 natolambert closed 5 months ago
0
Logging some new models

#132 natolambert closed 5 months ago
0
Fixes to analysis scripts

#131 natolambert closed 6 months ago
0
Set up OpenRouter for llm-as-a-judge

#130 natolambert closed 5 months ago
1
Mixed bag of fixes / updates

#129 natolambert closed 6 months ago
0
update the __call__ for slicpairpm

#128 WeiXiongUST closed 6 months ago
2
Update the parameters of __call__ for Slic pair PM so that the test runs smoothly

#127 WeiXiongUST closed 6 months ago
1
Update pip Installation Command

#126 Haoxiang-Wang closed 6 months ago
0
Add multi-gpu inference option

#125 natolambert opened 6 months ago
2
Improve run_generative documentation + add to pip

#124 natolambert closed 6 months ago
0
[Add Model] Pairwise Preference Model

#123 WeiXiongUST closed 6 months ago
3
Add generative models to pip install (probably with optional dependencies)

#122 natolambert closed 6 months ago
0
Make RewardBench pip installable + runnable!

#121 natolambert closed 6 months ago
0
Ensemble pre-computed reward outputs

#120 natolambert closed 6 months ago
0
Implement PoLL (LLM-as-a-judge ensembles)

#119 natolambert closed 6 months ago
0
Add PoLL for generative RM

#118 natolambert closed 6 months ago
0
Add `pad_token_id` from tokenizer to model config.

#117 hank0316 closed 7 months ago
2
Clarification Needed on DPO Reward Evaluation

#116 ZHZisZZ closed 7 months ago
4
`pad_token_id` issue

#115 hank0316 closed 7 months ago
7
Add `rewardbench` on pypi + basic release management

#114 natolambert closed 6 months ago
0
Use VLLM for llm as a judge on open models

#113 natolambert closed 7 months ago
0
New week's models + fixes

#112 natolambert closed 7 months ago
1
bon eval

#111 yuchenlin closed 1 month ago
1
New LLaMA-3 Seq. Classifier Model

#110 hendrydong closed 7 months ago
6
Fix type hint

#109 PavelCz closed 7 months ago
0
Missed commit

#108 natolambert closed 7 months ago
0
Update paths

#107 natolambert closed 7 months ago
0
