DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License
2.77k stars, 255 forks

CPU support #82

Open esteininger opened 1 year ago

esteininger commented 1 year ago

Swapped out the CUDA params for CPU support.

james-hu commented 9 months ago

Hi @esteininger, I am new to this repo, so this may be a naive question: does it work on CPU? I mean, with this patch, can it generate a response in a reasonable time? And does it work for both training and inference?

BTW, I don't think the owner of this repo would accept this kind of functionality change. If it works on CPU, a better approach may be to make the behaviour controlled by an optional parameter specified somewhere (command line, environment variable, or configuration file). Just my two cents.
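
To make the suggestion concrete, here is a minimal sketch of what an optional device setting could look like. It is not taken from this patch or from the repo: the `--device` flag, the `VIDEO_LLAMA_DEVICE` environment variable, and the `resolve_device` helper are all hypothetical names chosen for illustration. The idea is that the command line wins, then the environment variable, and the current GPU behaviour stays the default.

```python
import argparse
import os

# Hypothetical names for illustration only; neither exists in the Video-LLaMA repo.
ENV_VAR = "VIDEO_LLAMA_DEVICE"
DEFAULT_DEVICE = "cuda:0"  # assumed default, so existing GPU users see no change


def resolve_device(cli_value=None, environ=None, default=DEFAULT_DEVICE):
    """Pick a device string: CLI flag first, then env var, then the default."""
    environ = os.environ if environ is None else environ
    return cli_value or environ.get(ENV_VAR) or default


def build_parser():
    """Build a parser with a single optional --device flag."""
    parser = argparse.ArgumentParser(description="Device selection sketch")
    parser.add_argument(
        "--device",
        default=None,
        help="e.g. 'cpu' or 'cuda:0'; overrides the environment variable",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    device = resolve_device(args.device)
    print(f"Using device: {device}")
    # The model and inputs would then be moved with `.to(device)`
    # rather than hard-coded `.cuda()` calls.
```

With something like this, CPU support becomes opt-in (`--device cpu`) instead of replacing the CUDA path outright, which may be easier for the maintainers to accept.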