open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
Apache License 2.0
1.39k stars 194 forks

llava-onevision multi-image #607

Open jun0wanan opened 1 week ago

jun0wanan commented 1 week ago

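Can llava-ov handle multiple images? I'm not sure whether this evaluation includes multi-image benchmarks, or whether it only supports single-image or video input.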

kennymckormick commented 6 days ago

Hi @jun0wanan,

LLaVA-OV supports multi-image input. VLMEvalKit includes benchmarks, such as BLINK, that adopt a multi-image setting.
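
To make "multi-image setting" concrete, here is a minimal sketch of what a multi-image prompt could look like. It assumes the interleaved message layout (a list of dicts with `type` and `value` keys) described in the VLMEvalKit docs. The helper name `build_multi_image_message` and the file paths are hypothetical, and no model is loaded here.

```python
from typing import Dict, List


def build_multi_image_message(image_paths: List[str], question: str) -> List[Dict[str, str]]:
    """Hypothetical helper: put every image first, then the question text.

    The {"type": ..., "value": ...} layout is an assumption based on the
    interleaved message format described in the VLMEvalKit docs.
    """
    if not image_paths:
        raise ValueError("expected at least one image path")
    # One entry per image, in the order the images should be shown.
    message = [{"type": "image", "value": path} for path in image_paths]
    # The text prompt goes last.
    message.append({"type": "text", "value": question})
    return message


# Placeholder paths; substitute real image files before passing this to a model.
msg = build_multi_image_message(
    ["image_1.jpg", "image_2.jpg"],
    "Which of the two images shows the object from a closer viewpoint?",
)
```

A single-image prompt is just the special case with one path, so the same structure covers both settings.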