PKU-YuanGroup/LanguageBind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
https://arxiv.org/abs/2310.01852
MIT License
549 stars, 44 forks
issues
Add flash attention 2
#19
pphuc25
closed
5 months ago
7
Inconsistent running results of inference.py
#45
Jade999
closed
2 months ago
5
Clarification questions about the framework
#50
felmoreno1726
opened
4 weeks ago
4
What are the training configurations for full tuning?
#32
StanLei52
closed
4 months ago
4
Hashtags and prompts?
#21
Kamino666
closed
5 months ago
4
Combination of multiple modalities
#38
anthony-mendil
opened
3 months ago
3
When will you release the dataset?
#9
xiangchen-Z
closed
6 months ago
3
Non-reproducible MSRVTT results - I get R@1 accuracy less than 1%
#51
lennartmoritz
opened
3 weeks ago
2
where is LanguageBind_Image
#46
hd201708010401
opened
1 month ago
2
Can you share the NYU-D dataset you used for evaluation, e.g. how to split the dataset?
#29
bf-yang
closed
4 months ago
2
What's the difference between LanguageBind and LLaVA-1.5
#26
OPilgrim
closed
4 months ago
2
VIT-H model release
#22
tikboaHIT
closed
5 months ago
2
provide a sample data for training
#14
pphuc25
closed
5 months ago
2
Choice of Vit-L over Vit-H
#10
jacklishufan
closed
6 months ago
2
Text input length
#5
zhaoshitian
closed
7 months ago
2
GPU sources
#4
xiaoaoran
closed
7 months ago
2
Seeing excessive GPU memory usage during inference
#3
abhimanyu891998
closed
7 months ago
2
Difference from imagebind
#2
lzw-lzw
closed
7 months ago
2
How to load pt model trained according to Training LanguageBind step?
#48
haochange
opened
1 month ago
1
GPU resources
#47
letaozhang
opened
1 month ago
1
Fine-tuning LLM + LanguageBind?
#42
Crystalxd
opened
2 months ago
1
The length of text that the text encoder can handle
#40
song-wensong
opened
2 months ago
1
VIT-H model on other modality [Audio/Depth/Thermal]
#39
tikboaHIT
opened
3 months ago
1
Use of undefined functions during fine_tune with custom audio data
#37
okaybody10
closed
3 months ago
1
Audio-Language Alignment data for reproduction
#36
memoiry
opened
3 months ago
1
Vision encoder version
#34
JosephPai
closed
3 months ago
1
Congrats on Acceptance !!!
#33
SenmiaoORZ
opened
4 months ago
1
how to load LanguageBind/LanguageBind_Video_Huge_V1.5_FT model
#30
valencebond
closed
4 months ago
1
Why not share the backbone parameters between Image and Video?
#28
SCZwangxiao
closed
4 months ago
1
Does video feature extraction support a dynamic number of frames, and does performance drop or degrade compared with 8 frames?
#27
1093842024
closed
4 months ago
1
How to Initialize the multi-modal encoders & training from scratch
#25
chen-yy20
closed
5 months ago
1
where is the LanguageBind_Audio_FT in huggingface?
#24
kou35
closed
5 months ago
1
about LanguageBind_Video_merge
#23
kou35
closed
5 months ago
1
Which output parameter should be used for feature extraction and alignment?
#20
huainanchen
closed
5 months ago
1
Can I change embeddings['image'].shape from 768 to 1024?
#18
dongfeicui
closed
5 months ago
1
About download weights
#17
dongfeicui
closed
5 months ago
1
cannot run the code train
#13
pphuc25
closed
6 months ago
1
pretraining details
#12
xiaoen0
closed
6 months ago
1
how to use hugging face model
#11
carry-xz
closed
6 months ago
1
research about a model video captioning
#8
pphuc25
closed
6 months ago
1
docs: add if cuda available
#7
pphuc25
closed
7 months ago
1
bug in install requirements.txt
#6
pphuc25
closed
7 months ago
1
Update README.md
#1
eltociear
closed
7 months ago
1
Any plans to use Long-CLIP to extend text input token limit?
#53
lennartmoritz
opened
4 days ago
0
NameError: name 'get_audio_anno' is not defined
#52
noah003
opened
2 weeks ago
0
Question about video-text training
#49
Tunanzzz
closed
1 month ago
0
confusion about VIDAL-10M video-text data
#44
wli333
opened
2 months ago
0
Create depth_ddp_glpn.py
#43
BinZhu-ece
opened
2 months ago
0
Inquiry on Unimodal Fine-Tuning with Locked Image in LanguageBind
#41
hexinyi2101
closed
4 days ago
0
finetuning on a classification task
#35
Sravanthgithub
opened
3 months ago
0