thunlp/LLaVA-UHD
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
260 stars
14 forks
issues
How to evaluate your LLaVA-UHD model?
#26
Gaffey
opened
1 month ago
0
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input xxx to have 3 channels, but got xx channels instead
#25
wnzhyee
opened
1 month ago
2
The Vicuna LLM is not frozen during pretraining
#24
ZJULiHongxin
opened
2 months ago
0
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[12, 15, 336, 336] to have 3 channels, but got 15 channels instead
#23
mann1
opened
2 months ago
1
Trained Resampler with SigLIP, got non-converging loss
#22
lucasjinreal
opened
2 months ago
0
Inconsistent Calculation of Patch Numbers in Image Processing and Encoding
#21
ziyangliu666
opened
2 months ago
1
Can we get the weights? :)
#20
mvsoom
opened
2 months ago
0
Question about the adaptive size part
#19
lucasjinreal
opened
2 months ago
0
Loss Convergence and whether ViT is Trained
#18
SuperStacie
opened
2 months ago
0
batch size matters in training
#17
WizardMx
opened
2 months ago
0
Wouldn't a max length of 2048 run out of context if 6 images are put in?
#16
OpenJarvisAI
opened
2 months ago
0
[Question] Proof about Range of Slice Aspect Ratios
#15
JJJYmmm
opened
2 months ago
1
Can I input 2688×672 images?
#14
mvsoom
opened
2 months ago
0
Meaning of '8' and '4'
#13
phellonchen
opened
2 months ago
1
Runtime error
#12
Zhangjy1998
opened
2 months ago
1
[Question] `image_features` not matched to input text
#11
sibosutd
opened
3 months ago
0
position embedding reshape error
#10
GaoXiaoshan
opened
3 months ago
2
Weird pre-train loss
#9
BubvieyKevin
opened
3 months ago
1
A problem configuring the environment: ERROR: llava@ file://
#8
ShuoZhang2003
opened
3 months ago
3
About the slice_logic
#7
power0341
closed
3 months ago
2
Update slice_logic.py
#6
power0341
opened
3 months ago
0
RuntimeError: Given groups=1, weight of size [1024, 3, 14, 14], expected input[16, 9, 336, 336] to have 3 channels, but got 9 channels instead
#5
piantic
opened
3 months ago
12
How do you train your ViTs?
#4
baichuanzhou
closed
3 months ago
3
update vit
#3
yiyexy
closed
2 months ago
1
Weird slice_logic test output
#2
SuperStacie
closed
3 months ago
1
Comparing with LLaVA 1.6 Next
#1
choyakawa
opened
3 months ago
1