VITA-Group / LLaGA

[ICML2024] "LLaGA: Large Language and Graph Assistant", Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, Zhangyang Wang
Apache License 2.0

Resources used during training in paper? #15

Closed · AnnabellaMacaluso closed this 2 months ago

AnnabellaMacaluso commented 2 months ago

Hi, I was wondering which GPUs were used during training, the approximate total training time, and the number of GPUs. Thanks!

ChenRunjin commented 2 months ago

I was using an A6000 48GB GPU for training. If I remember correctly, training the Pubmed node classification task takes about 15 minutes on a single GPU, and the Arxiv node classification task with the HO template takes about 5-6 hours (ND takes a bit longer since the prompts are longer). If you train with multiple GPUs using DeepSpeed, the training time decreases roughly in proportion to the number of GPUs, with minimal overhead (e.g., about 1.5 hours with 4 GPUs on Arxiv node classification).
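The near-linear scaling above can be sanity-checked with simple arithmetic. This is just an illustrative sketch, not code from the repo; the `overhead_factor` value is an assumption standing in for communication/sync cost:

```python
def estimated_multi_gpu_hours(single_gpu_hours, num_gpus, overhead_factor=1.05):
    """Estimate wall-clock hours for data-parallel training that scales
    near-linearly with GPU count; overhead_factor > 1 models sync cost."""
    return single_gpu_hours / num_gpus * overhead_factor

# Arxiv node classification (HO template): ~5.5h reported on one A6000
print(round(estimated_multi_gpu_hours(5.5, 4), 2))  # ~1.44h, close to the ~1.5h reported for 4 GPUs
```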