DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Apache License 2.0
752 stars 50 forks

Audio branch #69

Open Morgott-The-Omen-King opened 1 month ago

Morgott-The-Omen-King commented 1 month ago

Hello, authors,

When will the audio-visual fine-tuned VideoLLaMA 2 be available? Or can we fine-tune it ourselves, starting from the already visually fine-tuned VideoLLaMA 2?

Thanks in advance.

mfarre commented 1 month ago

I have the same question. I would like to use the VideoLLaMA 2 training code to evaluate some datasets, and having the audio part would be very interesting. Thank you!