mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT

What is the minimum system requirement for training #55

Closed: nahidalam closed this issue 9 months ago

nahidalam commented 9 months ago

Hi, I see in the training info page (https://github.com/mbzuai-oryx/Video-ChatGPT/blob/main/docs/train_video_chatgpt.md) that training is done with 8 A100 40GB GPUs. Is that the minimum requirement? Will training work on a single GPU? Will it work on other GPUs such as the V100?

nahidalam commented 9 months ago

Looks like you use FlashAttention, which requires Ampere or newer GPUs, so the V100 (Volta) will not work.

Will it train on A10 machines, then?

mmaaz60 commented 9 months ago

Hi @nahidalam,

Firstly, apologies for the late reply; it has been a busy few weeks. Secondly, our model can also be trained on a single A100 40GB GPU. In that case you can set `--gradient_accumulation_steps 8` so that the overall effective batch size stays at 32 (per-device batch size 4 × 8 accumulation steps on one GPU, instead of 4 × 8 GPUs).

https://github.com/mbzuai-oryx/Video-ChatGPT/blob/main/docs/train_video_chatgpt.md#train-video-chatgpt
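For reference, a minimal sketch of what the single-GPU launch might look like, assuming the torchrun entry point and the per-device batch size of 4 used in the training doc linked above; all other model/data/output flags are elided here and should be taken verbatim from that doc:

```bash
# Single A100-40GB: run 1 process instead of 8 and compensate with gradient accumulation.
# Effective batch size = nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps
#                      = 1 * 4 * 8 = 32, matching the 8-GPU setup (8 * 4 * 1 = 32).
torchrun --nproc_per_node=1 --master_port 29001 video_chatgpt/train/train_mem.py \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8 \
    ...  # remaining flags exactly as in train_video_chatgpt.md
```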

The code should also work on an A10; however, in that case you may have to lower the per-device batch size further (and increase gradient accumulation accordingly) because of the A10's smaller GPU memory (24 GB).
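As a hedged example of that adjustment (not from the repo docs, just the same arithmetic applied to a smaller card), you could halve the per-device batch and double the accumulation steps:

```bash
# A10 (24 GB): per-device batch 2, accumulation 16 -> effective batch 1 * 2 * 16 = 32.
torchrun --nproc_per_node=1 --master_port 29001 video_chatgpt/train/train_mem.py \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 16 \
    ...  # remaining flags exactly as in train_video_chatgpt.md
```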