issues
iburenko/multimodal-reading-group
2 stars, 0 forks
[Paper Suggestion] PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning
#12
iburenko
closed
3 months ago
2
Two papers on Multi-modal Chain-of-Thought
#11
iburenko
closed
3 months ago
1
[Paper Suggestion] GLaMM: Pixel Grounding Large Multimodal Model
#10
lbuess
closed
5 months ago
1
[Paper Suggestion] Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
#9
iburenko
opened
5 months ago
0
[Paper Suggestion] Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
#7
iburenko
opened
5 months ago
0
[Paper Suggestion] Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!
#6
iburenko
closed
3 months ago
1
[Paper Suggestion] Woodpecker: Hallucination Correction for Multimodal Large Language Models
#5
iburenko
opened
5 months ago
0
[Paper Suggestion] Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
#4
Krotonus
closed
6 months ago
1
[Paper Suggestion] Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
#3
Krotonus
closed
7 months ago
1
[Paper Suggestion] Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
#2
lbuess
closed
7 months ago
4
[Paper Suggestion] Sigmoid Loss for Language Image Pre-Training
#1
Lalith-Manjunath
closed
8 months ago
1