TIGER-AI-Lab / MMLU-Pro

The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
Apache License 2.0

Add Tencent Hunyuan-Large #43

Open EwoutH opened 2 weeks ago

EwoutH commented 2 weeks ago

They claim an overall MMLU-Pro score of 60.2.

The currently unveiled Hunyuan-Large (Hunyuan-MoE-A52B) model is the largest open-source Transformer-based MoE model in the industry, featuring a total of 389 billion parameters and 52 billion active parameters.

https://huggingface.co/tencent/Tencent-Hunyuan-Large
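For anyone who wants to try this before it lands on the leaderboard, here is a minimal sketch of loading the model and the benchmark with the Hugging Face `transformers` and `datasets` libraries. It is not this repo's official evaluation pipeline: the zero-shot prompt and regex answer extraction are simplified stand-ins (the reported leaderboard numbers use 5-shot CoT), and loading a 389B-parameter checkpoint with `device_map="auto"` assumes you have the hardware for it.

```python
# Hedged sketch: score tencent/Tencent-Hunyuan-Large on a sample of MMLU-Pro.
# NOT the repo's official evaluation; prompt and answer parsing are simplified.
import re
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tencent/Tencent-Hunyuan-Large"  # from the link above

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",       # assumes enough GPU memory for a 389B MoE model
    trust_remote_code=True,
)

# MMLU-Pro ships as a Hugging Face dataset; questions have up to 10 options
# and the gold label is a letter in the "answer" field.
test_set = load_dataset("TIGER-Lab/MMLU-Pro", split="test")

def is_correct(example) -> bool:
    options = "\n".join(
        f"{chr(ord('A') + i)}. {opt}" for i, opt in enumerate(example["options"])
    )
    prompt = (
        f"Question: {example['question']}\n{options}\n"
        "Answer with the letter of the correct option.\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    completion = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    match = re.search(r"[A-J]", completion)  # first option letter in the output
    return bool(match) and match.group(0) == example["answer"]

# Evaluate on a small sample; the full test split takes much longer.
sample = test_set.select(range(100))
correct = sum(is_correct(ex) for ex in sample)
print(f"accuracy on sample: {correct / len(sample):.2%}")
```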

psych0v0yager commented 1 week ago

Agreed, this would be very interesting to see.

NSbuilder commented 1 week ago

They claim an overall MMLU-Pro score of 60.2.

60.2 is for the base model; the instruct version would probably score higher.