DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License
2.77k stars, 255 forks

Compatibility between torch and torchvision? #152

Open shreyakannan1205 opened 6 months ago

shreyakannan1205 commented 6 months ago

Hello,

I am trying to run the Video-LLaMA demo and am setting up the environment on my Linux machine.

These are the steps I followed:

1) Cloned the Video-LLaMA repo.
2) Inside it, cloned the 7B-Finetuned model repo.
3) Set up conda and created an environment from environment.yaml.
4) Installed ffmpeg with conda install ffmpeg, because my VM doesn't allow apt install or pip install outside the conda environment.
5) Activated the environment.

When I run import torchvision, I get this error:

miniconda3/envs/videollama/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev

My torchvision version is 0.13.1 and torch version is 2.2.2. Are these compatible? Or is there something else I am missing?
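
For reference, an undefined-symbol warning like the one above usually means torchvision's compiled extension was built against a different torch release than the one installed. Below is a minimal sketch of a pairing check. The lookup table is a partial copy of the torch/torchvision pairings published by PyTorch, and the helper names (minor, is_compatible, installed_versions) are hypothetical, written only for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Partial torch -> torchvision pairing by minor release (assumption: copied
# by hand from PyTorch's published compatibility table; extend as needed).
EXPECTED_TORCHVISION = {
    "1.12": "0.13",
    "1.13": "0.14",
    "2.0": "0.15",
    "2.1": "0.16",
    "2.2": "0.17",
}


def minor(ver):
    """Reduce '2.2.2+cu121' to '2.2' (drop the build tag and patch number)."""
    base = ver.split("+")[0]
    return ".".join(base.split(".")[:2])


def is_compatible(torch_ver, torchvision_ver):
    """Return True/False for a known torch release, None if it is not in the table."""
    expected = EXPECTED_TORCHVISION.get(minor(torch_ver))
    if expected is None:
        return None
    return minor(torchvision_ver) == expected


def installed_versions():
    """Read installed versions from package metadata without importing torch."""
    found = {}
    for name in ("torch", "torchvision"):
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The versions reported in this issue.
    print(is_compatible("2.2.2", "0.13.1"))
    print(installed_versions())
```

With the versions from this issue, is_compatible("2.2.2", "0.13.1") returns False: the table pairs torch 2.2.x with torchvision 0.17.x, and torchvision 0.13.x with torch 1.12.x.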

It would be really helpful if someone could give me some tips on how to fix this!

Thanks!