open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0
1.08k stars, 154 forks

[Dataset] Add GQA TestDev Balanced Split to Dataset #431

Closed. Mor-Li closed this 3 weeks ago

Mor-Li commented 4 weeks ago

This pull request adds the GQA TestDev Balanced split, which contains 12,578 entries, to the dataset configurations. The split is commonly used for evaluation and is now registered in the ImageVQADataset class, together with its download URL and MD5 hash. The evaluation method has also been updated to support accuracy calculation for GQA.
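For readers unfamiliar with how accuracy on a short-answer VQA split like GQA is typically scored, here is a minimal sketch of exact-match accuracy. It is an illustration only, not the code in this PR: the function names `normalize_answer` and `gqa_accuracy` are hypothetical, and the normalization rules (lowercasing, stripping punctuation) are assumptions that may differ from what VLMEvalKit applies.

```python
import string


def normalize_answer(text: str) -> str:
    """Lowercase the answer and remove ASCII punctuation (assumed normalization)."""
    text = text.strip().lower()
    # Drop punctuation so that "Yes." and "yes" compare equal.
    return text.translate(str.maketrans("", "", string.punctuation)).strip()


def gqa_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the fraction of predictions that exactly match their reference
    after normalization. Hypothetical helper, not the VLMEvalKit API."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(
        normalize_answer(pred) == normalize_answer(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


if __name__ == "__main__":
    preds = ["Yes.", "red", "a dog"]
    refs = ["yes", "blue", "a dog"]
    print(f"accuracy = {gqa_accuracy(preds, refs):.4f}")
```

In this toy example two of the three predictions match after normalization. On the real split the same ratio would be computed over all 12,578 entries.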


For more details on the GQA dataset, refer to the GQA paper.