Closed by lj163ucas 8 months ago
Hi @lj163ucas, we are still evaluating those VLMs on MME and SEEDBench; the results will be available in one to two weeks. The data on the previous page was just mock data.
Great, thanks a lot.
Is it going well?
Hi @lj163ucas, evaluation on those benchmarks is now supported in VLMEvalKit (including the evaluation results). We will also add a multi-modal leaderboard to our official website in the next few days.
The multimodal evaluation described at https://opencompass.readthedocs.io/zh-cn/latest/advanced_guides/multimodal_eval.html uses python run.py configs/multimodal/tasks.py --mm-eval from opencompass. Is this still supported? It currently fails with an error when I run it, and the leaderboard says it uses VLMEvalKit.
Please try VLMEvalKit; VLM evaluation has been deprecated in the opencompass repo.
Describe the feature
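The multimodal leaderboard previously seemed to evaluate several benchmarks together (similar to how the LLM leaderboard aggregates several datasets), but now only MMBench is there. Have the evaluations on the other datasets been moved somewhere else? Is there any chance they could be restored?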
Will you implement it?