PKU-YuanGroup LanguageBind issues

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

https://arxiv.org/abs/2310.01852

MIT License

549 stars, 44 forks

issues

Add flash attention 2

#19 pphuc25 closed 5 months ago
7
Inconsistent running results of inference.py

#45 Jade999 closed 2 months ago
5
Clarification questions about the framework

#50 felmoreno1726 opened 4 weeks ago
4
What are the training configurations for full tuning?

#32 StanLei52 closed 4 months ago
4
Hashtags and prompts?

#21 Kamino666 closed 5 months ago
4
Combination of multiple modalities

#38 anthony-mendil opened 3 months ago
3
When will you release the dataset?

#9 xiangchen-Z closed 6 months ago
3
Non-reproducible MSRVTT results - I get R@1 accuracy less than 1%

#51 lennartmoritz opened 3 weeks ago
2
where is LanguageBind_Image

#46 hd201708010401 opened 1 month ago
2
Can you share the NYU-D dataset you used for evaluation, e.g. how to split the dataset?

#29 bf-yang closed 4 months ago
2
What's the difference between LanguageBind and LLaVA-1.5

#26 OPilgrim closed 4 months ago
2
VIT-H model release

#22 tikboaHIT closed 5 months ago
2
provide a sample data for training

#14 pphuc25 closed 5 months ago
2
Choice of Vit-L over Vit-H

#10 jacklishufan closed 6 months ago
2
Text input length

#5 zhaoshitian closed 7 months ago
2
GPU sources

#4 xiaoaoran closed 7 months ago
2
Seeing excessive GPU memory usage during inference

#3 abhimanyu891998 closed 7 months ago
2
Difference from imagebind

#2 lzw-lzw closed 7 months ago
2
How to load pt model trained according to Training LanguageBind step?

#48 haochange opened 1 month ago
1
GPU resources

#47 letaozhang opened 1 month ago
1
Fine-tuning LLM + LanguageBind?

#42 Crystalxd opened 2 months ago
1
The length of text that the text encoder can handle

#40 song-wensong opened 2 months ago
1
VIT-H model on other modality [Audio/Depth/Thermal]

#39 tikboaHIT opened 3 months ago
1
Use of undefined functions during fine_tune with custom audio data

#37 okaybody10 closed 3 months ago
1
Audio-Language Alignment data for reproduction

#36 memoiry opened 3 months ago
1
Vision encoder version

#34 JosephPai closed 3 months ago
1
Congrats on Acceptance !!!

#33 SenmiaoORZ opened 4 months ago
1
how to load LanguageBind/LanguageBind_Video_Huge_V1.5_FT model

#30 valencebond closed 4 months ago
1
Why not share the backbone parameters between Image and Video?

#28 SCZwangxiao closed 4 months ago
1
Does video feature extraction support a dynamic number of frames, and does performance drop compared with 8 frames?

#27 1093842024 closed 4 months ago
1
How to Initialize the multi-modal encoders & training from scratch

#25 chen-yy20 closed 5 months ago
1
where is the LanguageBind_Audio_FT in huggingface?

#24 kou35 closed 5 months ago
1
about LanguageBind_Video_merge

#23 kou35 closed 5 months ago
1
For feature extraction and alignment, which output should be used?

#20 huainanchen closed 5 months ago
1
Can I change embeddings['image'].shape from 768 to 1024?

#18 dongfeicui closed 5 months ago
1
About download weights

#17 dongfeicui closed 5 months ago
1
Cannot run the training code

#13 pphuc25 closed 6 months ago
1
pretraining details

#12 xiaoen0 closed 6 months ago
1
how to use hugging face model

#11 carry-xz closed 6 months ago
1
research about a model video captioning

#8 pphuc25 closed 6 months ago
1
docs: add if cuda available

#7 pphuc25 closed 7 months ago
1
bug in install requirements.txt

#6 pphuc25 closed 7 months ago
1
Update README.md

#1 eltociear closed 7 months ago
1
Any plans to use Long-CLIP to extend text input token limit?

#53 lennartmoritz opened 4 days ago
0
NameError: name 'get_audio_anno' is not defined

#52 noah003 opened 2 weeks ago
0
Question about video-text training

#49 Tunanzzz closed 1 month ago
0
confusion about VIDAL-10M video-text data

#44 wli333 opened 2 months ago
0
Create depth_ddp_glpn.py

#43 BinZhu-ece opened 2 months ago
0
Inquiry on Unimodal Fine-Tuning with Locked Image in LanguageBind

#41 hexinyi2101 closed 4 days ago
0
finetuning on a classification task

#35 Sravanthgithub opened 3 months ago
0