Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development
https://llama2-accessory.readthedocs.io/

Clarification of SEED score #194

Closed Isaachhh closed 6 months ago

Isaachhh commented 6 months ago

I notice that SPHINX achieves 70+ on SEED-Bench. I wonder whether you report image accuracy rather than total accuracy, because LLaVA reports total accuracy (58.6 for 7B and 61.6 for 13B).

Artanic30 commented 6 months ago

Hi, we report the SEED-Bench image accuracy for SPHINX. The results have been verified by the authors of SEED-Bench. You may refer to the SEED-Bench Leaderboard for more information.

Isaachhh commented 6 months ago

I see.

But if you report SEED-Bench image accuracy for SPHINX, the LLaVA scores in your comparison are wrong, since they are total accuracy. You should report image accuracy for LLaVA as well.

Artanic30 commented 6 months ago

Thanks for pointing this out. It's a mistake in our paper. We will correct the LLaVA results in the latest version.
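
For anyone comparing the numbers discussed in this thread, the difference between the two metrics can be sketched roughly as below. This is a minimal illustration, not the official SEED-Bench evaluation script: the function `seed_accuracies`, the record fields `data_type` and `correct`, and the toy records are all hypothetical. It assumes each question is tagged as an image or a video question, that image accuracy averages over image questions only, and that total accuracy averages over every question.

```python
from typing import Dict, List


def seed_accuracies(records: List[dict]) -> Dict[str, float]:
    """Return accuracy in percent over image questions only and over all questions.

    Each record is a hypothetical dict with two keys:
      "data_type": "image" or "video"
      "correct":   True if the model answered the question correctly
    """

    def pct(rows: List[dict]) -> float:
        # Guard against an empty subset so we never divide by zero.
        if not rows:
            return float("nan")
        return 100.0 * sum(1 for r in rows if r["correct"]) / len(rows)

    image_rows = [r for r in records if r["data_type"] == "image"]
    return {"image": pct(image_rows), "total": pct(records)}


# Toy data (made up): 3 of 4 image questions correct, 1 of 4 video questions correct.
records = (
    [{"data_type": "image", "correct": True}] * 3
    + [{"data_type": "image", "correct": False}]
    + [{"data_type": "video", "correct": True}]
    + [{"data_type": "video", "correct": False}] * 3
)

scores = seed_accuracies(records)
print(scores)
```

On this toy data the image accuracy and the total accuracy differ noticeably, which is why a table should use the same one of the two metrics for every model it compares.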