AILab-CVC SEED-Bench issues

(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

Other

315 stars 12 forks

issues

huggingface Leaderboard not working

#31 LindiaC opened 2 weeks ago
0
cc3m Image type

#30 insafim closed 1 month ago
1
Data source for "Transit Maps" model: not listed in this repository? also not listed on Huggingface?

#29 miku2000 opened 4 months ago
1
Wrong Question Types in SEED-Bench1

#28 littlepenguin89106 opened 6 months ago
3
Question about evaluation input format

#27 yellow-binary-tree opened 6 months ago
1
Evaluation Method for Closed-Source Models like GPT4V

#26 JUNJIE99 closed 6 months ago
1
any plan to release the original images for SEED-Bench-2-Plus?

#25 hjeun opened 6 months ago
1
Question on multi-image input

#24 auhowielau opened 8 months ago
2
[bugs] LLaVA-Evaluation : line81, '[img]' is supposed to be '<img>'?

#23 LeoWootsi closed 7 months ago
1
Update eval.py

#22 JJJYmmm opened 9 months ago
0
[bugs] LLaVA-Evaluation : RuntimeError: Expected all tensors to be on the same device

#21 JJJYmmm opened 9 months ago
0
Multi-GPU Evaluation

#20 JJJYmmm opened 9 months ago
0
Question on how task27 generates images

#19 JunZhan2000 opened 10 months ago
2
fix filter_questions

#18 simonJJJ closed 11 months ago
0
What is the correct way to download the video

#17 teasgen closed 11 months ago
6
Request for removing duplicate results

#16 khanrc closed 11 months ago
2
a lot of data with more questions than pictures in SEED-Bench-2 level L2, is this reasonable?

#15 nemonameless opened 11 months ago
5
In-Context Example Selection Process

#14 mustafaadogan opened 11 months ago
1
How to download the images?

#13 dyahadila closed 1 year ago
0
VLMs vs LLMs evaluation

#12 idan-tankel opened 1 year ago
1
Easy way to probe result examples?

#11 chancharikmitra opened 1 year ago
1
Request for the interface of minigpt4 and llava

#10 Richar-Du opened 1 year ago
2
Reproduce the Qwen-VL SOTA results

#9 jinze1994 opened 1 year ago
2
Support for evaluation of other VLM models like MiniGPT-4, mPLUG-Owl, Llava, and VPGTrans

#8 WesleyHsieh0806 opened 1 year ago
2
This benchmark could lead to wrong conclusion.

#7 dannyhung1128 opened 1 year ago
3
[Data] Could you provide a list including the files of something-something v2 which should be downloaded?

#6 aopolin-lv closed 1 year ago
2
Add InstructBlip Flan-T5-xl and InstructBlip Flan-T5-xxl

#5 brianjking opened 1 year ago
1
Ask for original captions from which you generate questions

#4 YangYangGirl closed 1 year ago
1
Update for mPLUG-Owl

#3 MAGAer13 opened 1 year ago
1
Evaluating latest version of OpenFlamingo

#2 anas-awadalla closed 1 year ago
1
Update for Otter-Image-MPT7B and Otter-Video

#1 Luodian opened 1 year ago
1