iburenko / multimodal-reading-group

2 stars, 0 forks

issues

[Paper Suggestion] PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

#12 iburenko closed 3 months ago
2
Two papers on Multi-modal Chain-of-Thought

#11 iburenko closed 3 months ago
1
[Paper Suggestion] GLaMM: Pixel Grounding Large Multimodal Model

#10 lbuess closed 5 months ago
1
[Paper Suggestion] Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

#9 iburenko opened 5 months ago
0
[Paper Suggestion] Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

#7 iburenko opened 5 months ago
0
[Paper Suggestion] Does my multimodal model learn cross-modal interactions? It’s harder to tell than you might think!

#6 iburenko closed 3 months ago
1
[Paper Suggestion] Woodpecker: Hallucination Correction for Multimodal Large Language Models

#5 iburenko opened 5 months ago
0
[Paper Suggestion] Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

#4 Krotonus closed 6 months ago
1
[Paper Suggestion] Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies

#3 Krotonus closed 7 months ago
1
[Paper Suggestion] Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs

#2 lbuess closed 7 months ago
4
[Paper Suggestion] Sigmoid Loss for Language Image Pre-Training

#1 Lalith-Manjunath closed 8 months ago
1