Clarification of SEED score

Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development

https://llama2-accessory.readthedocs.io/

Clarification of SEED score #194

Closed Isaachhh closed 5 months ago

Isaachhh commented 5 months ago

I notice that SPHINX achieves 70+ on SEED-Bench. Are you reporting image accuracy rather than total accuracy? LLaVA reports total accuracy (58.6 for 7B and 61.6 for 13B).

Artanic30 commented 5 months ago

Hi, we report the SEED-Bench image accuracy for SPHINX. The results were verified by the authors of SEED-Bench. You may refer to the SEED-Bench Leaderboard for more information.
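For readers comparing the two metrics, here is a minimal sketch of how image accuracy and total accuracy can diverge on a benchmark that mixes image and video questions, as SEED-Bench does. The counts and the `results` layout are made up for illustration; they are not SEED-Bench data and this is not the official evaluation script.

```python
# Hypothetical per-split counts, for illustration only
# (not real SEED-Bench numbers or any model's results).
results = {
    "image": {"correct": 600, "total": 800},
    "video": {"correct": 100, "total": 200},
}


def split_accuracy(results, split):
    """Accuracy (in percent) over a single split, e.g. image questions only."""
    r = results[split]
    return 100.0 * r["correct"] / r["total"]


def total_accuracy(results):
    """Accuracy (in percent) over all questions pooled across splits."""
    correct = sum(r["correct"] for r in results.values())
    total = sum(r["total"] for r in results.values())
    return 100.0 * correct / total


image_acc = split_accuracy(results, "image")
video_acc = split_accuracy(results, "video")
overall_acc = total_accuracy(results)

print(f"image: {image_acc:.1f}  video: {video_acc:.1f}  total: {overall_acc:.1f}")
```

With these made-up counts the image-only figure is higher than the pooled figure, which is why the two numbers should not be placed in the same table column.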

Isaachhh commented 5 months ago

I see.

But if you report SEED-Bench image accuracy for SPHINX, the LLaVA scores you compare against are wrong, since those are total accuracy. You should report image accuracy for LLaVA as well.

Artanic30 commented 5 months ago

Thanks for pointing this out; it is a mistake in our paper. We will correct the LLaVA results in the latest version.