infinigence / LVEval

Repository of LV-Eval Benchmark
MIT License

Updated Benchmark Results, like GPT-4o, LLaMA 3.1, and Qwen 2 #3

Open rgtjf opened 3 weeks ago

rgtjf commented 3 weeks ago

Thank you for the great work! I've noticed that most of the existing benchmark results are somewhat outdated. Is there any possibility of releasing updated evaluations for models like GPT-4o, LLaMA 3.1, and Qwen 2?

yuantao2108 commented 2 weeks ago

Thanks for your attention and recognition of our work. Partial evaluations and explanations for LLaMA 3.1 and Qwen 2 have already been added to the info. However, due to the high cost (3537.6 USD) of a GPT-4o evaluation, there are no plans to update it in the near future.