AnnabellaMacaluso closed 2 months ago
I used an A6000 48GB GPU for training. If I remember correctly, training takes about 15 minutes on the Pubmed node classification task with a single GPU, and about 5-6 hours on the Arxiv node classification task with the HO template (ND takes a bit longer because the prompt is longer). If you train on multiple GPUs with DeepSpeed, the training time drops roughly in proportion to the number of GPUs, with minimal overhead (e.g. 1.5 hours with 4 GPUs on Arxiv node classification).
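To make the scaling arithmetic concrete, here is a rough sketch of the estimate, assuming near-linear speedup across GPUs. The `estimate_hours` helper and its `efficiency` parameter are hypothetical names for illustration only and are not part of the repository or of DeepSpeed.

```python
def estimate_hours(single_gpu_hours: float, num_gpus: int, efficiency: float = 1.0) -> float:
    """Estimate wall-clock training hours under near-linear multi-GPU scaling.

    `efficiency` is an assumed scaling factor in (0, 1]; 1.0 means perfectly
    linear speedup, lower values model communication overhead.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return single_gpu_hours / (num_gpus * efficiency)


# Single-GPU range quoted above for Arxiv node classification (HO template).
low = estimate_hours(5.0, num_gpus=4)
high = estimate_hours(6.0, num_gpus=4)
print(f"{low:.2f}-{high:.2f} h")  # prints "1.25-1.50 h"
```

With ideal scaling, the 5-6 hour single-GPU range maps to 1.25-1.5 hours on 4 GPUs, which is in line with the 1.5 hours mentioned above.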
Hi, I was wondering which GPUs were used during training, how many GPUs were used, and the approximate total training time. Thanks!