open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0
1.34k stars, 188 forks

[Dataset] Add GQA TestDev Balanced Split to Dataset #431

Closed: Mor-Li closed this 2 months ago

Mor-Li commented 2 months ago

This pull request adds the GQA TestDev Balanced split (12,578 entries), which is commonly used for evaluation, to the dataset configurations. The split is registered in the ImageVQADataset class together with its download URL and MD5 hash, and the evaluation method now supports accuracy calculation for GQA.
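The two pieces described above, a URL/MD5 registration and an accuracy metric, can be illustrated with a minimal sketch. This is not the code from the pull request: the dictionary names, the dataset key, the placeholder URL and hash, and the helpers `file_md5`, `normalize`, and `gqa_accuracy` are all assumptions made for illustration, and the exact-match scoring after light normalization is one plausible reading of "accuracy" rather than the toolkit's confirmed method.

```python
import hashlib

# Hypothetical registry entries: the real URL and hash from the PR are not
# reproduced here, so both values are placeholders.
DATASET_URL = {"GQA_TestDev_Balanced": "https://example.com/GQA_TestDev_Balanced.tsv"}
DATASET_MD5 = {"GQA_TestDev_Balanced": "<md5-of-the-tsv-file>"}


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def normalize(answer):
    """Lowercase, trim, drop trailing periods, and collapse whitespace."""
    text = str(answer).lower().strip().rstrip(".")
    return " ".join(text.split())


def gqa_accuracy(predictions, answers):
    """Fraction of predictions that exactly match the reference after normalization."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    hits = sum(normalize(p) == normalize(a) for p, a in zip(predictions, answers))
    return hits / len(answers)
```

A downloaded file would be accepted only if `file_md5(path)` equals the registered hash. Under this sketch, `gqa_accuracy(["Yes", "red"], ["yes", "blue"])` scores one of two answers as correct.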

For more details on the GQA dataset, refer to the GQA paper.