Hi @hkuit,
Yes, you can train on a single A100-80G GPU. Please make sure to keep the overall (effective) batch size at 32. With one GPU, this can be achieved with the following settings:
--per_device_train_batch_size 8 \
--gradient_accumulation_steps 4 \
Please let me know if it works. Thanks.
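For reference, the arithmetic behind those two flags can be sketched as follows. This is a minimal illustration, assuming the training script multiplies the per-device batch size, the gradient accumulation steps, and the number of GPUs to get the overall batch size (the usual convention for these flag names). The helper names `effective_batch_size` and `accumulation_steps_for` are hypothetical and not part of the repository.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_gpus: int = 1) -> int:
    """Overall batch size seen by one optimizer step (assumed convention)."""
    return per_device * grad_accum_steps * num_gpus


def accumulation_steps_for(target: int, per_device: int, num_gpus: int = 1) -> int:
    """Gradient accumulation steps needed to reach a target overall batch size."""
    samples_per_forward = per_device * num_gpus
    if target % samples_per_forward != 0:
        raise ValueError(
            f"target batch size {target} is not divisible by {samples_per_forward}"
        )
    return target // samples_per_forward


# Single GPU with the settings suggested above: 8 * 4 * 1
print(effective_batch_size(per_device=8, grad_accum_steps=4, num_gpus=1))  # 32

# If memory only allows 4 samples per device, accumulate over more steps instead.
print(accumulation_steps_for(target=32, per_device=4, num_gpus=1))  # 8
```

If 8 samples per device do not fit in memory, lowering the per-device batch size and raising the accumulation steps by the same factor should keep the overall batch size at 32 under the same assumption.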
Thank you for the reply, @mmaaz60. Let me try it and I'll update you later.
Thanks, it works.
Hello, thanks for the great work.
Can I train the model using only one A100-80G GPU? If not, how can we modify the code so that it can be trained on one GPU? Thank you so much.