haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0
19.14k stars 2.1k forks

[Feature request] Q-Bench (LLaVA-v1.5 excels on low-level vision abilities) #538

Closed teowu closed 10 months ago

teowu commented 11 months ago

feature

Hi Haotian,

It's a great pleasure to reach out. Congratulations on this nice and solid work.

We are a team from NTU S-Lab working on image/video quality assessment, and we recently proposed a benchmark called Q-Bench, which aims to measure the low-level perception and understanding abilities of MLLMs/VLLMs/LMMs. It is a sibling project to MMBench, likewise based on multiple-choice questions, and we found that LLaVA-v1.5 reached top-1 performance on the benchmark at its release date, as shown in https://github.com/VQAssessment/Q-Bench/tree/master/leaderboards. Would it be possible for us to write an evaluation script (similar to the existing one for MMBench) and merge it into the evaluation scripts of LLaVA?

Best, Haoning

haotian-liu commented 11 months ago

Sure, that would be great! It would be helpful for the community to have more benchmarks conveniently available in our codebase.

teowu commented 10 months ago

Hi Haotian,

Thank you for the kind reply. Here is the pull request: https://github.com/haotian-liu/LLaVA/pull/581/commits. We modified four files: 2 shell scripts, 1 evaluation script, and a paragraph added to the docs.

Thank you so much again.

Best, Haoning

haotian-liu commented 10 months ago

Thank you, and sorry for the delay. We have merged the PR. Congratulations again on the great benchmark!