dvlab-research / LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Apache License 2.0

Requirements for running inference with llama-vid-13b-full-224-video-fps-1 #69

Open sykuann opened 6 months ago

sykuann commented 6 months ago

Hi, just checking whether the minimum requirements for running inference on a short video are known, in terms of GPU, CPU, and VRAM.

Thanks.

yanwei-li commented 6 months ago

Hi, we have not tested the minimum requirements, but we use a single 3090 with 24 GB of GPU memory for our implementation.
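
For anyone sizing hardware against the 24 GB figure above, a rough back-of-envelope estimate of weight memory can help. The sketch below is not from the LLaMA-VID repository: the function name `estimate_weight_gib` is hypothetical, the 13-billion parameter count is an assumption read off the "13b" in the model name, and the estimate covers language-model weights only (no activations, KV cache, vision encoder, or framework overhead), so real usage will be higher.

```python
def estimate_weight_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB.

    Hypothetical helper for illustration; ignores activations,
    KV cache, the vision encoder, and framework overhead.
    """
    return num_params * bytes_per_param / 1024**3


# Assumption: roughly 13 billion parameters, inferred from "13b" in the model name.
NUM_PARAMS = 13e9

# Bytes per parameter for common storage precisions.
PRECISIONS = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

for name, nbytes in PRECISIONS.items():
    gib = estimate_weight_gib(NUM_PARAMS, nbytes)
    print(f"{name}: about {gib:.1f} GiB for weights only")
```

Comparing these numbers with the memory of the card you have gives a first indication of which precision is worth trying. Whether a given quantized mode is supported for this checkpoint is something to confirm in the repository itself.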