FanBu02 closed this issue 1 month ago.
Thanks for your interest! To run inference on audio-related tasks, clone the repository and switch to the audio_visual branch (https://github.com/DAMO-NLP-SG/VideoLLaMA2/tree/audio_visual).
Can I use a WAV file as input for inference? Could you tell me roughly how to modify the code?
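For anyone with the same WAV question: this thread does not show how the audio_visual branch loads audio, so the following is only a generic sketch of reading a WAV file into mono float samples using the Python standard library. The helper name `load_wav_mono` is hypothetical and not part of VideoLLaMA2. It assumes 16-bit PCM input. The sample rate and tensor format the model actually expects would need to be checked against the branch's own inference code.

```python
import struct
import wave


def load_wav_mono(path):
    """Read a 16-bit PCM WAV file; return (sample_rate, mono samples in [-1, 1)).

    Hypothetical helper for illustration only, not a VideoLLaMA2 API.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    # WAV PCM data is interleaved little-endian signed 16-bit integers.
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)

    # Average the channels of each frame to get mono, then scale to [-1.0, 1.0).
    mono = [
        sum(ints[i:i + n_channels]) / (n_channels * 32768.0)
        for i in range(0, len(ints), n_channels)
    ]
    return sample_rate, mono
```

The returned list could then be resampled or converted to whatever array type the inference script takes, assuming it accepts raw waveforms at all.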
Hello, I'm a PhD student at ZJU and I also use VideoLLaMA2 for my own research. We have created a WeChat group to discuss VideoLLaMA2 issues and help each other. Would you like to join us? Please contact me: WeChat: LiangMeng19357260600, phone: +86 19357260600, e-mail: liangmeng89@zju.edu.cn.