DAMO-NLP-SG/VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Apache License 2.0
752
stars
50
forks
issues
When will the audio branch be released?
#99
XuecWu
opened
3 days ago
0
⭐ [Feat] Supporting audio and audio-visual stages.
#98
xinyifei99
opened
4 days ago
0
code for batch inference
#97
zhangjic22
opened
5 days ago
0
videollama2_av
#96
xinyifei99
closed
4 days ago
0
Problem: Segmentation fault (core dumped)
#95
CamellIyquitous
opened
1 week ago
1
Can I run VideoLLaMA 1 in this repo?
#94
jun297
opened
1 week ago
0
Can videollama2 continue finetuning on my own dataset using 32 frames?
#93
zhengrongz
closed
1 week ago
2
VideoLLaMA2 performance gap on video benchmarks
#92
zhuqiangLu
opened
1 week ago
0
Videochatgpt_gen link for Test_Human_Annotated_Captions is not valid
#91
jun297
opened
2 weeks ago
0
ValueError: The following `model_kwargs` are not used by the model: ['images_or_videos', 'modal_list']
#90
CaffeyChen
opened
2 weeks ago
0
After fine-tuning, the model outputs repetitive phrases
#89
Jackyzjz
opened
2 weeks ago
2
🔧 [Refactor] Make codebase reproducible for previous version.
#88
clownrat6
closed
2 weeks ago
0
Could you please advise when the checkpoint for the audio branch will be made public?
#87
ymxyll
opened
3 weeks ago
3
Deployment on huggingface endpoints
#86
aliayub40995
opened
3 weeks ago
1
What is the difference between the 'base' and 'chat' versions of a model type?
#85
Lanbai-eleven
closed
3 weeks ago
2
Can we do the only text, image and text and video and text finetuning with lora in a one run
#84
thisurawz1
closed
2 weeks ago
1
how to do the inference with the finetune weights / model
#83
thisurawz1
opened
1 month ago
4
UnboundLocalError: local variable "video_path" referenced before assignment
#82
acDante
opened
1 month ago
3
Cannot reproduce results on vllava datasets
#81
williamium3000
opened
1 month ago
20
Model keeps output "there is no sound/ I can not hear anything" when there is actual sound
#80
qixueweigitbub
opened
1 month ago
1
train and fine tune for audio-video
#79
trahman8
opened
1 month ago
1
Unable to load *ANY BASE MODEL* in 4bit
#78
ApoorvFrontera
opened
1 month ago
1
Error while loading Mixtral based SFT MoE model VideoLLaMA2-8x7B: SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
#77
ApoorvFrontera
opened
1 month ago
0
Weird "Invalid base64-encoded" Error
#76
lucasxu777
opened
1 month ago
0
Missing comparison
#75
LiquidAmmonia
opened
1 month ago
0
Problem about processor in load_pretrained_model
#74
ShuyUSTC
opened
1 month ago
3
Can I use a WAV file as input for inference?
#73
FanBu02
opened
1 month ago
0
🔧 [v1.5] Bump to v1.5 codebase.
#72
clownrat6
closed
1 month ago
0
Error while loading custom finetuned QLoRA model in 4 bit : size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
#71
ApoorvFrontera
opened
1 month ago
2
Maybe a bug on data preprocess
#70
Weili-NLP
closed
1 month ago
2
Audio branch
#69
Morgott-The-Omen-King
opened
1 month ago
1
Minor typo in arxiv paper
#68
QAQdev
closed
1 month ago
1
how to define target modules in the Qlora script
#67
thisurawz1
closed
4 weeks ago
2
Segmentation fault with the provided inference code
#66
yankee624
opened
1 month ago
3
How to run finetuned model with gradio
#65
thisurawz1
opened
1 month ago
1
Long video error.
#64
seTalent
opened
1 month ago
2
Release about Audio branch
#63
XuecWu
closed
1 month ago
2
Bug exists during installation.
#62
CasperFang
closed
1 month ago
1
When will the weights of MoE based on Mixtral 8x7B model released?
#61
ApoorvFrontera
closed
1 month ago
2
Recommend some configurations
#60
flinzhao
opened
2 months ago
0
How to run the finetuned model with LoRA adapters.
#59
thisurawz1
opened
2 months ago
6
how to finetune Videollama2 chat models using QLoRA and LoRA.
#58
thisurawz1
opened
2 months ago
0
Fine tuning using "finetune_lora.sh" file
#57
lucasxu777
opened
2 months ago
0
Webvid-10M (40% sampling)
#56
VJatla
opened
2 months ago
0
🔧 [Fix] Standardizing conversation template name.
#55
clownrat6
closed
2 months ago
0
Doubt regarding training data
#54
pritamqu
closed
2 months ago
3
Datasets download for MSVC
#53
1xbq1
closed
2 months ago
2
Unable to install
#52
psergiu21
opened
2 months ago
0
How to finetune on the "VideoLLaMA2-7B" instead of "VideoLLaMA2-7B-Base"?
#51
Zeqing-Wang
closed
2 months ago
3
How to eval egoschema and perception_test?
#50
caichaoxiang
closed
2 months ago
1