google-deepmind / gemma

Open weights LLM from Google DeepMind.
http://ai.google.dev/gemma
Apache License 2.0
2.39k stars 298 forks

Reproducing evaluations #42

Open adiprasad opened 1 month ago

adiprasad commented 1 month ago

I am trying to reproduce the evaluation numbers from the technical report but have not been able to.

For example, for gemma-2-9b the technical report lists 68.2 on BBH (3-shot, CoT), while the Open LLM Leaderboard reported 5.05.

Were any special settings used for the evaluations?

adiprasad commented 1 month ago

@tilakrayal, any update on this?