It looks like you are using Flash Attention, so a V100 will not work (Flash Attention does not support Volta GPUs).
Will it train on A10 machines then?
Hi @nahidalam,
Firstly, apologies for the late reply; the past few weeks have been busy. Secondly, our model can also be trained on a single A100 40 GB GPU. In that case, you can set `--gradient_accumulation_steps 8` so that the overall batch size remains 32.
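For reference, here is a minimal launch sketch for the single-GPU case. It assumes the `video_chatgpt/train/train_mem.py` entry point and the per-device batch size of 4 from the linked training doc; all other flags are omitted and should be taken from `docs/train_video_chatgpt.md`:

```bash
# Effective batch size = num_gpus x per_device_batch x grad_accum_steps
# Single A100 40 GB: 1 x 4 x 8 = 32, matching the 8-GPU recipe (8 x 4 x 1)
torchrun --nproc_per_node=1 video_chatgpt/train/train_mem.py \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 8
# ...plus the remaining flags from docs/train_video_chatgpt.md
```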
The code should also work on an A10; however, in that case you may have to lower the per-device batch size further because of the smaller GPU memory.
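On an A10 (24 GB), one option is to halve the per-device batch size and double the accumulation steps so the effective batch size stays at 32. Note that a per-device batch of 2 is only an illustrative guess, not a tested value:

```bash
# A10 24 GB (assumed): 1 x 2 x 16 = 32, same effective batch size as before
torchrun --nproc_per_node=1 video_chatgpt/train/train_mem.py \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 16
# ...plus the remaining flags from docs/train_video_chatgpt.md
```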
Hi, I see on the training info page (https://github.com/mbzuai-oryx/Video-ChatGPT/blob/main/docs/train_video_chatgpt.md) that training uses 8 A100 40 GB GPUs. Is that the minimum requirement? Will single-GPU training work? Will training work on other GPUs, such as the V100?