FreedomIntelligence / MLLM-Bench

MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria

Where can I find answers? #2

Closed Hambaobao closed 10 months ago

Hambaobao commented 10 months ago

Hello, thanks for your great work. May I ask where I can find the answers to these questions so that I can evaluate my model? 🤥

g-h-chen commented 10 months ago

Hi there, as introduced in our paper, our benchmark consists of 420 open-ended questions. To conduct an evaluation, you'll need to use GPT-4V, feeding it an image, the question, and the pair of answers you intend to compare.
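
For reference, here is a minimal sketch (not the official evaluation script) of how one might send a single sample to GPT-4V for pairwise comparison. It assumes the OpenAI Python SDK (>=1.0), a local image file, and an illustrative judging prompt; the actual per-sample criteria and model endpoint should follow the paper and your own API access.

```python
# Sketch: ask GPT-4V which of two candidate answers is better for one sample.
# Assumptions: OpenAI Python SDK >= 1.0, OPENAI_API_KEY set in the environment,
# and an illustrative prompt (not the benchmark's official rubric).
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def encode_image(path: str) -> str:
    """Base64-encode a local image so it can be sent as a data URL."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def judge_pair(image_path: str, question: str, answer_a: str, answer_b: str) -> str:
    """Ask GPT-4V to compare two candidate answers for the given image and question."""
    image_b64 = encode_image(image_path)
    prompt = (
        f"Question: {question}\n\n"
        f"Answer A: {answer_a}\n\n"
        f"Answer B: {answer_b}\n\n"
        "Considering the image, which answer is better? Reply with 'A', 'B', or 'Tie', "
        "followed by a brief justification."
    )
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # assumed model name; use whichever GPT-4V endpoint you have access to
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                ],
            }
        ],
        max_tokens=300,
    )
    return response.choices[0].message.content
```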