thunlp LLaVA-UHD issues

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

260 stars 14 forks

issues

How to evaluate your LLaVA-UHD model?

#26 Gaffey opened 1 month ago
0
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input xxx to have 3 channels, but got xx channels instead

#25 wnzhyee opened 1 month ago
2
The Vicuna LLM is not frozen during pretraining

#24 ZJULiHongxin opened 2 months ago
0
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[12, 15, 336, 336] to have 3 channels, but got 15 channels instead

#23 mann1 opened 2 months ago
1
Trained Resample with SigLIP, but the loss does not converge

#22 lucasjinreal opened 2 months ago
0
Inconsistent Calculation of Patch Numbers in Image Processing and Encoding

#21 ziyangliu666 opened 2 months ago
1
Can we get the weights? :)

#20 mvsoom opened 2 months ago
0
Question about the adaptive size part

#19 lucasjinreal opened 2 months ago
0
Loss Convergence and whether ViT is Trained

#18 SuperStacie opened 2 months ago
0
batch size matters in training

#17 WizardMx opened 2 months ago
0
Wouldn't a max length of 2048 run out of context if 6 images are put in?

#16 OpenJarvisAI opened 2 months ago
0
[Question] Proof about Range of Slice Aspect Ratios

#15 JJJYmmm opened 2 months ago
1
Can I input 2688×672 images?

#14 mvsoom opened 2 months ago
0
Meaning of '8' and '4'

#13 phellonchen opened 2 months ago
1
Runtime error

#12 Zhangjy1998 opened 2 months ago
1
[Question] `image_features` not matched to input text

#11 sibosutd opened 3 months ago
0
position embedding reshape error

#10 GaoXiaoshan opened 3 months ago
2
Weird pre-train loss

#9 BubvieyKevin opened 3 months ago
1
A problem configuring the environment: ERROR: llava@ file://

#8 ShuoZhang2003 opened 3 months ago
3
About the slice_logic

#7 power0341 closed 3 months ago
2
Update slice_logic.py

#6 power0341 opened 3 months ago
0
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[16, 9, 336, 336] to have 3 channels, but got 9 channels instead

#5 piantic opened 3 months ago
12
How do you train your ViTs?

#4 baichuanzhou closed 3 months ago
3
update vit

#3 yiyexy closed 2 months ago
1
Weird slice_logic test output

#2 SuperStacie closed 3 months ago
1
Comparing with LLaVA 1.6 Next

#1 choyakawa opened 3 months ago
1