DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License

Error loading the audio #163

Open xjr01 opened 6 months ago

xjr01 commented 6 months ago

There seems to be a bug in the function upload_video() of the class Chat in video_llama/conversation/conversation_video.py. On line 255 of conversation_video.py, video_path is passed directly to load_and_transform_audio_data(), which does not support video formats. This raises an exception inside load_and_transform_audio_data(), so audio loading is skipped and a message saying that no audio was found is printed, even when the video does contain an audio track.
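
One possible workaround is to extract the audio track to a WAV file first and hand that file to the audio loader instead of the video. The following is a minimal sketch, not the repository's code: the helper names (is_video_file, build_ffmpeg_command, extract_audio_to_wav), the extension list, and the 16 kHz mono output are all assumptions made for illustration, and it assumes the ffmpeg command-line tool is installed.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical list of extensions treated as video containers (an assumption,
# not taken from the Video-LLaMA repository).
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}


def is_video_file(path: str) -> bool:
    """Return True if the path ends with one of the assumed video extensions."""
    return Path(path).suffix.lower() in VIDEO_EXTENSIONS


def build_ffmpeg_command(video_path: str, wav_path: str, sample_rate: int = 16000) -> list:
    """Build the ffmpeg argument list that writes the audio track as mono WAV."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", video_path,          # input video
        "-vn",                     # drop the video stream
        "-ac", "1",                # mix down to one channel
        "-ar", str(sample_rate),   # resample (16 kHz is an assumed default)
        wav_path,
    ]


def extract_audio_to_wav(video_path: str, sample_rate: int = 16000) -> str:
    """Extract the audio track to a temporary WAV file and return its path.

    Raises RuntimeError if ffmpeg is not on PATH, and
    subprocess.CalledProcessError if ffmpeg fails (for example, when the
    video has no audio stream).
    """
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    fd, wav_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # ffmpeg reopens the file by name
    command = build_ffmpeg_command(video_path, wav_path, sample_rate)
    subprocess.run(command, check=True, capture_output=True)
    return wav_path
```

In upload_video(), the path returned by extract_audio_to_wav(video_path) could then be passed to load_and_transform_audio_data() in place of video_path when is_video_file(video_path) is true. A failure from ffmpeg would then distinguish a video that really has no audio from one whose audio simply could not be read in its original container.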