Add maj@k metric

huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

MIT License

689 stars, 78 forks

Closed clefourrier closed 5 months ago

clefourrier commented 5 months ago

Still missing:

[x] merge greedy sampling with the basic greedy evals, to avoid running generation more times than needed - we can do the sampled generations + the normal generations in one step, then split on the number of samples each metric needs when we apply the metrics - this will divide the time taken by 2
[x] update all the other launchers (inference endpoints, nanotron, etc.) - they don't cover it yet XD
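
To make the first item concrete, here is a minimal sketch of the "generate once, split when applying the metrics" idea. It is not lighteval's actual code: `maj_at_k`, `exact_match`, `METRICS`, `samples_to_generate`, and `apply_metrics` are hypothetical names, and the sketch assumes each metric declares how many generations it needs and simply reads a prefix of one shared list of generations.

```python
from collections import Counter


def maj_at_k(generations, gold, k):
    """Return 1 if the most frequent answer among the first k generations equals gold."""
    if not 1 <= k <= len(generations):
        raise ValueError("k must be between 1 and the number of generations")
    votes = Counter(g.strip() for g in generations[:k])
    # On a tie, Counter.most_common returns the answer that was seen first.
    majority_answer, _ = votes.most_common(1)[0]
    return int(majority_answer == gold.strip())


def exact_match(generations, gold):
    """Single-generation metric: only the first generation is scored."""
    return int(generations[0].strip() == gold.strip())


# Hypothetical registry: metric name -> (scoring function, generations needed).
METRICS = {
    "exact_match": (exact_match, 1),
    "maj@3": (lambda gens, gold: maj_at_k(gens, gold, 3), 3),
    "maj@5": (lambda gens, gold: maj_at_k(gens, gold, 5), 5),
}


def samples_to_generate(metrics):
    """Generate once, with as many samples as the most demanding metric needs."""
    return max(needed for _, needed in metrics.values())


def apply_metrics(generations, gold, metrics):
    """Give each metric only the prefix of generations it asked for."""
    return {
        name: fn(generations[:needed], gold)
        for name, (fn, needed) in metrics.items()
    }


if __name__ == "__main__":
    # Made-up generations for one prompt, produced in a single generation step.
    generations = ["41", "42", "42", "7", "42"]
    print(samples_to_generate(METRICS))
    print(apply_metrics(generations, "42", METRICS))
```

The point of the design is that the model is called once per prompt with the largest sample count, and the split happens only at scoring time, so adding another metric does not add another generation pass.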

clefourrier commented 5 months ago

I'll check it - however, please don't merge yet, I need to propagate the changes to the other launchers + simplify greedy :)

NathanHB commented 5 months ago

Oh, my bad, I thought it was ready for review!

clefourrier commented 5 months ago

I hope I'll finish it today :) It'll be a bit bigger because I'm simplifying part of the current system

clefourrier commented 5 months ago

Tests failing because of the hub problems ^^"""

NathanHB commented 5 months ago

Is this good to be merged? I saw on Slack that you did not get the same results for the Mistral models. (Do we even know how they ran their tests?)

clefourrier commented 5 months ago

I'll do more tests today - we could merge it now and adjust later if needed though.

NathanHB commented 5 months ago

Alright, I'm merging it so that we can merge the other PRs