allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
https://huggingface.co/spaces/allenai/reward-bench
Apache License 2.0
442 stars, 52 forks
Fix llama3 quantization for DPO models
#145
Closed
natolambert
closed
5 months ago
natolambert
commented
5 months ago
@hamishivi noticed this by running the llama3 tulus 🐫