Open appvoid opened 4 months ago
Which evaluation pipeline are you using?
Hey @liuzechun, the evals in the README are wrong. The paper:
vs the README
What happened is that the column headers were switched but the numbers remained the same. For example, the paper says HellaSwag for MobileLLM-LS-125M
is 39.5, but in the README it is 65.7, which in the paper corresponds to PIQA.
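A quick way to check for this kind of header swap is to look up each README value in the paper's row and see which column it actually belongs to. The sketch below is only an illustration: `find_header_swaps` is a hypothetical helper, and the two dictionaries contain just the MobileLLM-LS-125M numbers quoted above, not the full tables.

```python
def find_header_swaps(paper_row, readme_row):
    """Map each mismatched README column to the paper column(s) with the same value."""
    swaps = {}
    for readme_col, value in readme_row.items():
        if paper_row.get(readme_col) == value:
            continue  # README agrees with the paper for this column
        # Look for the same number under a different header in the paper
        matches = [col for col, v in paper_row.items() if v == value]
        if matches:
            swaps[readme_col] = matches
    return swaps


# Only the values quoted in this issue (MobileLLM-LS-125M); not the full tables.
paper_row = {"PIQA": 65.7, "HellaSwag": 39.5}
readme_row = {"HellaSwag": 65.7}

print(find_header_swaps(paper_row, readme_row))
```

If a README column maps to a different paper column like this, the number itself is probably fine and only the header is misplaced.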
I think the README.md has some issues regarding the evals.
I just noticed it with PIQA: the numbers are too low compared to the actual paper.