facebookresearch / MobileLLM

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

eval issues? #2

Open appvoid opened 3 months ago

appvoid commented 3 months ago

I think the README.md has some issues regarding the evals.

I just noticed it with PIQA: the numbers are too low compared to the actual paper.

liuzechun commented 3 months ago

Which evaluation pipeline are you using?

diegoasua commented 3 months ago

Hey @liuzechun, the evals are wrong in the README.

The paper: [screenshot of the paper's results table]

vs the README: [screenshot of the README's results table]

What happened is that the column headers were switched while the numbers stayed the same. For example, the paper reports HellaSwag for MobileLLM-LS-125M as 39.5, but the README lists HellaSwag as 65.7, which in the paper corresponds to PIQA.
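A minimal sketch of the mismatch described above, using only the two numbers quoted in this thread (39.5 and 65.7 for MobileLLM-LS-125M). The `readme` dict assumes a straight two-column swap between HellaSwag and PIQA, which is an inference from "headers switched, numbers the same", not something verified against the full tables:

```python
# Numbers for MobileLLM-LS-125M as quoted in this thread.
paper = {"HellaSwag": 39.5, "PIQA": 65.7}    # values from the paper
readme = {"HellaSwag": 65.7, "PIQA": 39.5}   # assumed swapped README values

# Every per-task value disagrees between the two tables...
mismatched = sorted(t for t in paper if paper[t] != readme[t])

# ...but the README's "HellaSwag" number is exactly the paper's PIQA number,
# consistent with swapped column headers rather than a re-run of the evals.
headers_swapped = readme["HellaSwag"] == paper["PIQA"]

print(mismatched, headers_swapped)
```

If the headers were simply swapped, the fix is relabeling the README columns, not regenerating any numbers.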