AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
Other
315 stars
12 forks
issues
huggingface Leaderboard not working
#31
LindiaC
opened
2 weeks ago
0
cc3m Image type
#30
insafim
closed
1 month ago
1
Data source for "Transit Maps" model: not listed in this repository? also not listed on Huggingface?
#29
miku2000
opened
4 months ago
1
Wrong Question Types in SEED-Bench1
#28
littlepenguin89106
opened
6 months ago
3
Question about evaluation input format
#27
yellow-binary-tree
opened
6 months ago
1
Evaluation Method for Closed-Source Models like GPT4V
#26
JUNJIE99
closed
6 months ago
1
any plan to release the original images for SEED-Bench-2-Plus?
#25
hjeun
opened
6 months ago
1
Question on multi-image input
#24
auhowielau
opened
8 months ago
2
[bugs] LLaVA-Evaluation : line81, '[img]' is supposed to be '<img>'?
#23
LeoWootsi
closed
7 months ago
1
Update eval.py
#22
JJJYmmm
opened
9 months ago
0
[bugs] LLaVA-Evaluation : RuntimeError: Expected all tensors to be on the same device
#21
JJJYmmm
opened
9 months ago
0
Multi-GPU Evaluation
#20
JJJYmmm
opened
9 months ago
0
Question on how task27 generates images
#19
JunZhan2000
opened
10 months ago
2
fix filter_questions
#18
simonJJJ
closed
11 months ago
0
What is the correct way to download the video
#17
teasgen
closed
11 months ago
6
Request for the removing duplicate results
#16
khanrc
closed
11 months ago
2
a lot of data with more questions than pictures in SEED-Bench-2 level L2, is this reasonable?
#15
nemonameless
opened
11 months ago
5
In-Context Example Selection Process
#14
mustafaadogan
opened
11 months ago
1
How to download the images?
#13
dyahadila
closed
1 year ago
0
VLMs vs LLMs evaluation
#12
idan-tankel
opened
1 year ago
1
Easy way to probe result examples?
#11
chancharikmitra
opened
1 year ago
1
Request for the interface of minigpt4 and llava
#10
Richar-Du
opened
1 year ago
2
Reproduce the Qwen-VL SOTA results
#9
jinze1994
opened
1 year ago
2
Support for evaluation of other VLM models like MiniGPT-4, mPLUG-Owl, Llava, and VPGTrans
#8
WesleyHsieh0806
opened
1 year ago
2
This benchmark could lead to wrong conclusion.
#7
dannyhung1128
opened
1 year ago
3
[Data] Could you provide a list including the files of something-something v2 which should be downloaded?
#6
aopolin-lv
closed
1 year ago
2
Add InstructBlip Flan-T5-xl and InstructBlip Flan-T5-xxl
#5
brianjking
opened
1 year ago
1
Ask for original captions from which you generate questions
#4
YangYangGirl
closed
1 year ago
1
Update for mPLUG-Owl
#3
MAGAer13
opened
1 year ago
1
Evaluating latest version of OpenFlamingo
#2
anas-awadalla
closed
1 year ago
1
Update for Otter-Image-MPT7B and Otter-Video
#1
Luodian
opened
1 year ago
1