fawazsammani / nlxgpt

NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)

GPU and Train time #11

Closed BetterZH closed 1 year ago

BetterZH commented 1 year ago

Thanks for your excellent work. For the NLE models on the different datasets (VQA-X, ACT-X, e-SNLI-VE), how many GPUs are required for the pretraining and finetuning stages? And how many hours do the pretraining and finetuning stages take?

fawazsammani commented 1 year ago

Hi. If I remember correctly, pretraining was done on 3 GPUs and took around 2.5 days. For finetuning, one GPU is enough, and it takes a few hours depending on the dataset: VQA-X can finish in 2-3 hours, while e-SNLI-VE takes longer, around 6 hours.

Feel free to open this issue again if you have further questions.