TIGER-AI-Lab / MMLU-Pro

The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
Apache License 2.0

Add Ministral 3B and 8B #34

Closed: EwoutH closed this issue 1 month ago

EwoutH commented 1 month ago

Ministral 3B and 8B are SOTA edge models by Mistral AI: https://mistral.ai/news/ministraux/

It would be great to have them on the leaderboard!

Wyyyb commented 1 month ago

FYI, the 8B model has been added to the leaderboard. The 3B model's overall accuracy is only around 10%, which is within the range expected from random guessing, so it hasn't been added to the leaderboard.
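
For anyone wondering what "random" means here, below is a rough sketch of one way to check whether an accuracy is distinguishable from guessing. It assumes about 12,000 test questions with 10 answer options each (some MMLU-Pro questions have fewer options, so true chance accuracy is slightly higher) and uses a normal approximation to the binomial. The function names and the 1.96 cutoff are hypothetical choices for illustration, not the criterion the maintainers actually use.

```python
import math


def random_guess_interval(num_questions, num_options, z=1.96):
    """Approximate range of accuracies that pure guessing would produce.

    Uses a normal approximation to the binomial distribution.
    z=1.96 corresponds to a roughly 95% two-sided interval (an assumption).
    """
    p = 1.0 / num_options  # chance accuracy for uniform guessing
    se = math.sqrt(p * (1.0 - p) / num_questions)  # standard error of the accuracy
    return p - z * se, p + z * se


def exceeds_random(accuracy, num_questions, num_options, z=1.96):
    """True if the accuracy lies above the upper end of the guessing interval."""
    _, upper = random_guess_interval(num_questions, num_options, z)
    return accuracy > upper


if __name__ == "__main__":
    # Assumed sizes: ~12,000 questions, 10 options each.
    low, high = random_guess_interval(12000, 10)
    print(f"random-guess interval: {low:.4f} to {high:.4f}")
    print("10% exceeds random:", exceeds_random(0.10, 12000, 10))
```

Under these assumptions, an overall accuracy of about 10% sits inside the guessing interval, which matches the reasoning in the comment above.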