Closed by cosmojg 7 months ago
Thx for the suggestions. I actually did some of these already, just added the latest to the leaderboard.
wolfram/miquliz-120b-v2.0 82.21
alpindale/miquella-120b 82.15
migtissera/Tess-72B-v1.5b 81.78
I'll see if I can get TheProfessor to run; that one looks like a beast. Not sure if I'll get to lzlv & miquliz soon, but they're on my list.
@sqrkl Oh sweet, that's awesome! Thank you for all of your hard work! Out of curiosity, what kind of hardware are you using to run these benchmarks?
I spin up instances on runpod or vast.ai. My local hardware can't really run most of these models.
Just added TheProfessor-155b.
It would be really cool to see how some of the more recent megamerges stack up against the rest of the competition, especially those which build on top of miqu-1-70b. Specifically, I think it would be interesting to benchmark the EQ of the following models:
miqu-1-120b (miqu-1-70b + miqu-1-70b)
miquliz-120b-v2.0 (miqu-1-70b + lzlv_70b)
lzlv_70b (for comparison with its descendants)
And if it's not way too big to benchmark...