allenai / reward-bench

RewardBench: the first evaluation tool for reward models.

https://huggingface.co/spaces/allenai/reward-bench

Apache License 2.0

442 stars, 52 forks

Fix llama3 quantization for DPO models #145

Closed natolambert closed 5 months ago

natolambert commented 5 months ago

@hamishivi noticed this while running the Llama 3 Tulu models 🫠