AILab-CVC / SEED-Bench

(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

Add InstructBlip Flan-T5-xl and InstructBlip Flan-T5-xxl #5

Open brianjking opened 1 year ago

brianjking commented 1 year ago

Thank you for this amazing eval work, really crucial to move this space forward.

In addition to the InstructBlip Vicuna version, Salesforce also trained versions built on Blip2 with Flan-T5-xl and Flan-T5-xxl. I would love to see how these perform on the benchmark you've developed in SEED-Bench.

geyuying commented 1 year ago

Thank you for your attention to our SEED-Bench.

We have released the SEED-Bench leaderboard at https://huggingface.co/spaces/AILab-CVC/SEED-Bench_Leaderboard, and you can add your models' results to it by following our evaluation instructions.