Q-Future / Q-Bench

[ICLR2024 Spotlight] (GPT-4V/Gemini-Pro/Qwen-VL-Plus+16 OS MLLMs) A benchmark for multi-modality LLMs (MLLMs) on low-level vision and visual quality assessment.
https://q-future.github.io/Q-Bench/

About the details of evaluating the model offline. #8

Closed rubylan closed 10 months ago

rubylan commented 1 year ago

Hello authors,

Good job! I'm wondering whether I can test my own model on Q-Bench offline. I checked the A1 annotation JSON file and found an unknown field (which may be related to the sub-types distortion/others/in-context distortion/...). Also, the golden descriptions for A2 have not been released yet.

Thanks!

teowu commented 11 months ago

For A1, there are two additional fields: Type (0, 1, 2, corresponding to yes-no, what, how) and Concern (0, 1, 2, 3, corresponding to distortion, others, in-context distortion, in-context others). For the A1 dev set, you can run the evaluation offline.
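
As a rough illustration of how the Type and Concern codes could be used in an offline run, here is a minimal Python sketch that groups accuracy by both fields. The key names `type`, `concern`, and `correct_ans`, the toy records, and the case-insensitive exact-match scoring are assumptions made for this example. They are not confirmed details of the released JSON or of the official evaluation script.

```python
from collections import defaultdict

# Code-to-label mappings, taken from the description of the two fields.
TYPE_NAMES = {0: "yes-no", 1: "what", 2: "how"}
CONCERN_NAMES = {
    0: "distortion",
    1: "others",
    2: "in-context distortion",
    3: "in-context others",
}


def accuracy_by_group(items, predictions):
    """Return accuracy overall and per Type / Concern group.

    items: list of dicts with the (assumed) keys "type", "concern", "correct_ans".
    predictions: list of answer strings, aligned with items.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for item, pred in zip(items, predictions):
        # Assumed scoring rule: case-insensitive exact match.
        correct = pred.strip().lower() == item["correct_ans"].strip().lower()
        groups = (
            "overall",
            "type/" + TYPE_NAMES[item["type"]],
            "concern/" + CONCERN_NAMES[item["concern"]],
        )
        for group in groups:
            totals[group] += 1
            hits[group] += int(correct)
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical records standing in for entries of the A1 dev annotation file.
items = [
    {"type": 0, "concern": 0, "correct_ans": "Yes"},
    {"type": 1, "concern": 1, "correct_ans": "Blur"},
    {"type": 2, "concern": 2, "correct_ans": "Low"},
    {"type": 0, "concern": 3, "correct_ans": "No"},
]
predictions = ["yes", "Noise", "Low", "No"]

scores = accuracy_by_group(items, predictions)
for group, acc in sorted(scores.items()):
    print(f"{group}: {acc:.2f}")
```

With real data, `items` would be loaded from the A1 dev JSON with `json.load`, and `predictions` would hold your model's answer for each question, once the actual key names in the file are confirmed.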

And yes, the A2 annotations have not been released, as they are kept private for now. You may email us to obtain the results.

Best,
Haoning