open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0

What is the estimated runtime across benchmarks, and the OpenAI API cost? #33

Closed findalexli closed 9 months ago

kennymckormick commented 10 months ago

Hi, @findalexli ,

  1. The runtime depends on the architecture of your VLM and may vary across different VLMs. As a data point, evaluating llava-v1.5-7b on MMBench_DEV_EN takes about 15 minutes on a single A100.
  2. The OpenAI API cost depends on the nature of the task and the instruction-following capability of your model (after all, if a VLM follows the instruction perfectly, no GPT post-processing is required). As some data points: evaluating different VLMs on MMBench_EN (dev + test) costs < $1.5 on average, and for models like llava-v1.5-7b and XComposer, no GPT cost is incurred at all during evaluation.
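The second point above can be sketched with a rough back-of-envelope estimator. This is not VLMEvalKit code — the function name, the token count per call, and the price per 1k tokens are all hypothetical placeholders; it only illustrates why a perfectly instruction-following model incurs zero GPT cost, since only unparseable answers would be sent to GPT for extraction.

```python
# Hypothetical sketch: estimate GPT post-processing cost for answer
# extraction. Assumes only answers that rule-based parsing fails on
# are sent to GPT; token counts and prices are illustrative guesses.

def estimate_gpt_cost(num_questions, parse_fail_rate,
                      tokens_per_call=300, price_per_1k_tokens=0.0015):
    """Return a rough USD estimate for GPT-based answer extraction."""
    gpt_calls = num_questions * parse_fail_rate
    return gpt_calls * tokens_per_call / 1000 * price_per_1k_tokens

# A model that follows the instruction perfectly never triggers GPT,
# so its extraction cost is exactly zero.
print(estimate_gpt_cost(6600, 0.0))  # 0.0
# A model whose answers fail parsing half the time costs noticeably more.
print(round(estimate_gpt_cost(6600, 0.5), 3))
```

The actual cost per benchmark depends on the real failure rate and the GPT model used, which is why the numbers in the reply are averages rather than fixed figures.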