Closed — EwoutH closed this issue 2 months ago.
Some benchmarks for various quants are already showing up, run with chigkim/Ollama-MMLU-Pro and posted in this r/LocalLLaMA thread on Reddit.
Looks promising so far...
Thank you for your suggestions. We have updated the evaluation results for the Qwen2.5 series models in the leaderboard.
Just to save future folks a few clicks, the leaderboard is here: https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro
Awesome! I saw that the results are self-reported; are you planning to validate (one or more of) the Qwen2.5 models?
Tracking issue for the Qwen2.5 model family. These models are SOTA at their sizes on many benchmarks, and most are released under the permissive Apache 2.0 license.
Models on HuggingFace: Qwen2.5 | Qwen2.5-Coder | Qwen2.5-Math.
There are a lot of models in total, but I would add (at least):
Instruction-tuned variants are also available for most of these models.