DAMO-NLP-SG Video-LLaMA issues

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

BSD 3-Clause "New" or "Revised" License

2.83k stars, 263 forks

issues

RuntimeError: Internal: could not parse ModelProto from ../Video-LLaMA-2-7B-Finetuned/llama-2-7b-chat-hf/tokenizer.model

#175 hyun95roh opened 2 weeks ago
0
README and documentation translated into Turkish

#174 mock3ng opened 3 weeks ago
0
Can you fix the demo? The demo is no longer working

#173 thisurawz1 opened 1 month ago
0
The config files are stored locally, but I still get OSError: Can't load tokenizer for 'bert-base-uncased'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bert-base-uncased' is the correct path to a directory containing all relevant files for a BertTokenizer tokenizer.

#172 Asmallsoldier opened 2 months ago
0
Do you have plan to release Video-LLaMA checkpoints with LLaMA 3.1?

#171 ShramanPramanick opened 3 months ago
1
Issue in api endpoints

#170 RAJA102002 opened 4 months ago
0
Model outputs incorrect results

#169 shiyeeee closed 4 months ago
0
Audio input

#168 CHEN-H01 opened 5 months ago
0
Training duration?

#167 riariam opened 5 months ago
0
modelling_llama.py

#166 zeroQiaoba opened 5 months ago
1
Problem running demo: Loading checkpoint shards never finishes

#165 jpssoares opened 5 months ago
1
Fix error on loading audio of the input video, as described in issue #163.

#164 xjr01 opened 6 months ago
0
Error loading the audio

#163 xjr01 opened 6 months ago
0
Finetune with LoRA and QLoRA

#162 thisurawz1 opened 6 months ago
0
finetune-billa7b-zh inference error shape '[-1, 136]' is invalid for input of size 137

#161 len2618187 opened 6 months ago
0
Is the LLM kept frozen in both stages?

#160 Nastu-Ho opened 6 months ago
1
Dc more tweaks

#159 dcnoye closed 6 months ago
0
What if no frame_position_embeddings?

#158 LetsGoFir opened 6 months ago
0
ignore downloaded models

#157 dcnoye closed 7 months ago
0
Dc dockerize initial

#156 dcnoye closed 7 months ago
0
How to increase the number of input frames?

#155 onlyonewater opened 7 months ago
2
Possible bugs in LR scheduler

#154 SAGNIKMJR opened 7 months ago
0
.

#153 advenTure423 closed 7 months ago
0
Compatibility b/w torch and torchvision?

#152 shreyakannan1205 opened 7 months ago
0
Evaluation on large-scale dataset

#151 hritam-98 opened 8 months ago
1
Is Video-LLaMA capable of comprehending videos that have faces surrounded by bounding boxes (face recognition)?

#150 PhilipAmadasun opened 8 months ago
0
Unable to launch demo

#149 joysl opened 8 months ago
2
How To: Use hugging face checkpoints downloaded on a CentOS machine

#148 joysl closed 8 months ago
5
How to improve fine-tuning performance on downstream tasks

#147 Jinjikiko opened 8 months ago
0
What is the input sample of the forward function in videollama

#146 llx-08 opened 9 months ago
1
Incorrect model inference (what went wrong in my setup)

#145 jennyziyi-xu opened 9 months ago
0
How to select the video encoder of the chinese version with BiLLA or Ziya ?

#144 cm-xcju closed 9 months ago
2
Hugging Face demo runtime error

#143 sihoseanhan closed 9 months ago
2
Frame-aware?

#142 jayavanth closed 8 months ago
1
multi-cards training

#141 gqsmmz opened 10 months ago
0
A demo without gradio

#140 liboliba opened 10 months ago
1
example model deployment

#139 nahidalam opened 10 months ago
0
inf value occurs during forwarding process when fine-tuning VL branch with LLAVA-150K+MiniGPT4-3.5K+webvid-instruct

#138 xuboshen opened 11 months ago
1
Unable to access LLaMA weights to build Vicuna-7B

#137 muzairkhattak closed 11 months ago
1
Dear author, how much time does it take to train this model? With what type of GPU cards?

#136 zhangyuereal opened 11 months ago
0
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM: size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]). size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

#135 Amber0913 opened 11 months ago
0
Very poor audio understanding

#134 DumplingLife closed 11 months ago
1
How to finetune video-llama using deepspeed?

#133 tangyipeng100 opened 11 months ago
0
Prompt

#132 tobyperrett opened 12 months ago
0
Hugging Face Spaces not working!

#131 simmimak closed 12 months ago
1
The question about llama parameters during pre-training and fine-tuning.

#130 cooper12121 closed 12 months ago
2
Multiple Video-Text pair Support

#129 mustafaadogan opened 1 year ago
1
change the frames and query_tokens size

#128 AllenFind opened 1 year ago
0
Gradio does not work, stuck on uploading forever.

#127 whoishoa closed 1 year ago
1
Interesting prompt template

#126 tian1327 closed 1 year ago
1