DAMO-NLP-SG VideoLLaMA2 issues

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Apache License 2.0

752 stars, 50 forks

issues

When will the audio branch be released?

#99 XuecWu opened 3 days ago
0
⭐ [Feat] Supporting audio and audio-visual stages.

#98 xinyifei99 opened 4 days ago
0
code for batch inference

#97 zhangjic22 opened 5 days ago
0
videollama2_av

#96 xinyifei99 closed 4 days ago
0
Problem: Segmentation fault (core dumped)

#95 CamellIyquitous opened 1 week ago
1
Can I run VideoLLaMA 1 in this repo?

#94 jun297 opened 1 week ago
0
Can videollama2 continue finetuning on my own dataset using 32 frames?

#93 zhengrongz closed 1 week ago
2
VideoLLaMA2 performance gap on video benchmarks

#92 zhuqiangLu opened 1 week ago
0
Videochatgpt_gen link for Test_Human_Annotated_Captions is not valid

#91 jun297 opened 2 weeks ago
0
ValueError: The following `model_kwargs` are not used by the model: ['images_or_videos', 'modal_list']

#90 CaffeyChen opened 2 weeks ago
0
After fine-tuning, the model outputs repetitive phrases

#89 Jackyzjz opened 2 weeks ago
2
🔧 [Refactor] Make codebase reproducible for previous version.

#88 clownrat6 closed 2 weeks ago
0
Could you please advise when the checkpoint for the audio branch will be made public?

#87 ymxyll opened 3 weeks ago
3
Deployment on huggingface endpoints

#86 aliayub40995 opened 3 weeks ago
1
What is the difference between the 'base' and 'chat' versions of a model type?

#85 Lanbai-eleven closed 3 weeks ago
2
Can we do text-only, image-text, and video-text finetuning with LoRA in one run?

#84 thisurawz1 closed 2 weeks ago
1
how to do the inference with the finetune weights / model

#83 thisurawz1 opened 1 month ago
4
UnboundLocalError: local variable "video_path" referenced before assignment

#82 acDante opened 1 month ago
3
Cannot reproduce results on vllava datasets

#81 williamium3000 opened 1 month ago
20
Model keeps outputting "there is no sound / I can not hear anything" when there is actual sound

#80 qixueweigitbub opened 1 month ago
1
train and fine tune for audio-video

#79 trahman8 opened 1 month ago
1
Unable to load *ANY BASE MODEL* in 4bit

#78 ApoorvFrontera opened 1 month ago
1
Error while loading Mixtral based SFT MoE model VideoLLaMA2-8x7B: SafetensorError: Error while deserializing header: InvalidHeaderDeserialization

#77 ApoorvFrontera opened 1 month ago
0
Weird "Invalid base64-encoded" Error

#76 lucasxu777 opened 1 month ago
0
Missing comparison

#75 LiquidAmmonia opened 1 month ago
0
Problem about processor in load_pretrained_model

#74 ShuyUSTC opened 1 month ago
3
Can I use a WAV file as input for inference?

#73 FanBu02 opened 1 month ago
0
🔧 [v1.5] Bump to v1.5 codebase.

#72 clownrat6 closed 1 month ago
0
Error while loading custom finetuned QLoRA model in 4 bit : size mismatch for model.mm_projector.readout.0.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).

#71 ApoorvFrontera opened 1 month ago
2
Maybe a bug on data preprocess

#70 Weili-NLP closed 1 month ago
2
Audio branch

#69 Morgott-The-Omen-King opened 1 month ago
1
Minor typo in arxiv paper

#68 QAQdev closed 1 month ago
1
how to define target modules in the Qlora script

#67 thisurawz1 closed 4 weeks ago
2
Segmentation fault with the provided inference code

#66 yankee624 opened 1 month ago
3
How to run finetuned model with gradio

#65 thisurawz1 opened 1 month ago
1
Long video error.

#64 seTalent opened 1 month ago
2
Release about Audio branch

#63 XuecWu closed 1 month ago
2
Bug exists during installation.

#62 CasperFang closed 1 month ago
1
When will the weights of the MoE model based on Mixtral 8x7B be released?

#61 ApoorvFrontera closed 1 month ago
2
Recommend some configurations

#60 flinzhao opened 2 months ago
0
How to run the finetuned model with LoRA adapters.

#59 thisurawz1 opened 2 months ago
6
how to finetune Videollama2 chat models using QLoRA and LoRA.

#58 thisurawz1 opened 2 months ago
0
Fine tuning using "finetune_lora.sh" file

#57 lucasxu777 opened 2 months ago
0
Webvid-10M (40% sampling)

#56 VJatla opened 2 months ago
0
🔧 [Fix] Standardizing conversation template name.

#55 clownrat6 closed 2 months ago
0
Doubt regarding training data

#54 pritamqu closed 2 months ago
3
Datasets download for MSVC

#53 1xbq1 closed 2 months ago
2
Unable to install

#52 psergiu21 opened 2 months ago
0
How to finetune on the "VideoLLaMA2-7B" instead of "VideoLLaMA2-7B-Base"?

#51 Zeqing-Wang closed 2 months ago
3
How to eval egoschema and perception_test?

#50 caichaoxiang closed 2 months ago
1