RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
https://arxiv.org/abs/2312.02051
BSD 3-Clause "New" or "Revised" License

RAM and VRAM requirement #13

Closed: Coronal-Halo closed this issue 5 months ago

Coronal-Halo commented 6 months ago

Is there any way that this program can run with 16 or 24 GB of VRAM?

RenShuhuai-Andy commented 6 months ago

Hi, thanks for your interest.

You can try installing the bitsandbytes library and setting `model_config.low_resource = True` in https://github.com/RenShuhuai-Andy/TimeChat/blob/master/demo.ipynb. This loads the LLaMA model in 8-bit precision, potentially enabling the model to run with 24 GB of VRAM; see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/timechat/models/timechat.py#L139. However, I cannot guarantee the performance of the 8-bit LLaMA model.
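
A minimal sketch of the change, assuming the Video-LLaMA-style loading code used in demo.ipynb (the import paths and surrounding names like `args` and `cfg` follow that layout and may differ slightly in the actual notebook):

```python
# pip install bitsandbytes   # required for 8-bit weight loading

from timechat.common.config import Config
from timechat.common.registry import registry

# ... build `args` / `cfg` exactly as the notebook already does ...
cfg = Config(args)
model_config = cfg.model_cfg

# Load the LLaMA weights in 8-bit via bitsandbytes to reduce VRAM usage.
model_config.low_resource = True

model_cls = registry.get_model_class(model_config.arch)
model = model_cls.from_config(model_config).to(f"cuda:{args.gpu_id}")
```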

onlyonewater commented 5 months ago

I set low_resource to False and GPU memory usage was only 19 GB, so I think a 24 GB GPU like the RTX 3090 is enough for inference. With low_resource set to True, GPU memory usage is about 15 GB, which fits a 16 GB GPU such as the RTX 4060 Ti. @Coronal-Halo @RenShuhuai-Andy
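
For anyone verifying this on their own hardware, peak VRAM usage can be checked with PyTorch's standard memory API (independent of TimeChat itself):

```python
import torch

torch.cuda.reset_peak_memory_stats()
# ... run one inference pass here ...
peak_gib = torch.cuda.max_memory_allocated() / 1024 ** 3
print(f"Peak GPU memory: {peak_gib:.1f} GiB")
```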