Open rgtjf opened 3 weeks ago
Thanks for your attention and recognition of our work. Partial evaluations and explanations for LLaMA3.1 and Qwen2 have already been updated in info. However, due to the high cost (3537.6 USD) of a GPT-4o evaluation, there are no plans to update it in the near future.
Thank you for the great work! I've noticed that most of the existing benchmark results are somewhat outdated. Is there any possibility of releasing evaluations for the latest models, such as GPT-4o, LLaMA 3.1, and Qwen 2?