DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License

Gradio does not work, stuck on uploading forever. #127

Closed by whoishoa 9 months ago

whoishoa commented 9 months ago

I installed Video-LLaMA on two machines: a Google Cloud instance with 4 A100s and my personal Ubuntu PC with a 4090.

I downloaded:

git clone https://huggingface.co/DAMO-NLP-SG/Video-LLaMA-2-7B-Finetuned

My config file looks as follows:

  llama_model: "Video-LLaMA-2-7B-Finetuned/llama-2-7b-chat-hf"
  imagebind_ckpt_path: "Video-LLaMA-2-7B-Finetuned/"
  ckpt: 'Video-LLaMA-2-7B-Finetuned/VL_LLaMA_2_7B_Finetuned.pth'   # you can use our pretrained ckpt from https://huggingface.co/DAMO-NLP-SG/Video-LLaMA-2-13B-Pretrained/
  ckpt_2:  'Video-LLaMA-2-7B-Finetuned/AL_LLaMA_2_7B_Finetuned.pth'

It doesn't work on either machine: the Gradio demo stays stuck on uploading, and no logs are printed while the file uploads.

I tried several different JPEG, PNG, and MP4 files, with the same result.

whoishoa commented 9 months ago

Solved by pinning Gradio to version 3.37.0:

pip install gradio==3.37.0
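
To confirm which Gradio version an environment actually has before launching the demo, a minimal check along these lines may help. It is only a sketch: the helper names `installed_version` and `matches_pin` are hypothetical and not part of Video-LLaMA or Gradio, and the only fact taken from this thread is that 3.37.0 resolved the upload hang.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported in this thread to fix the stuck upload.
PINNED_GRADIO = "3.37.0"


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(installed, pinned=PINNED_GRADIO):
    """True only when the installed version string equals the pinned one exactly."""
    return installed == pinned


if __name__ == "__main__":
    found = installed_version("gradio")
    if matches_pin(found):
        print(f"gradio {found} matches the pinned version")
    else:
        print(f"gradio is {found!r}; try: pip install gradio=={PINNED_GRADIO}")
```

Run it inside the same environment (conda env or virtualenv) that launches the demo, since a different interpreter can see a different Gradio install.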