open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0
1.08k stars, 154 forks

[Minor] Fix MMAlaya2 #420

Closed: kennymckormick closed this 1 month ago

kennymckormick commented 1 month ago

When max_num = 24, the model requires two 80GB GPUs for evaluation. Otherwise, it requires only one 80GB GPU.

For multi-image scenarios, the max_num budget is split across the images: each image gets max(1, max_num / num_image), where num_image is the number of images in the sample.
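
A minimal sketch of the per-image rule above, for illustration only. The function name `per_image_max_num` is hypothetical and is not taken from the VLMEvalKit code, and the use of integer (floor) division is an assumption, since the description does not say how the quotient is rounded.

```python
def per_image_max_num(max_num: int, num_image: int) -> int:
    """Return the max_num budget for each image in a multi-image sample.

    Hypothetical helper: mirrors max(1, max_num / num_image) from the PR
    description. Floor division is an assumption, not confirmed by the source.
    """
    if num_image < 1:
        raise ValueError("num_image must be at least 1")
    # Split the total budget evenly, but never drop below 1 per image.
    return max(1, max_num // num_image)


if __name__ == "__main__":
    for n in (1, 2, 5, 30):
        print(f"max_num=24, num_image={n}: per-image max_num={per_image_max_num(24, n)}")
```

Under these assumptions, a single image keeps the full budget of 24, five images get 4 each, and any sample with more images than max_num falls back to 1 per image.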