qitianwu / NodeFormer

The official implementation of NeurIPS22 spotlight paper "NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification"

the training code for the GCN, SGC models on the ogbn-proteins and amazon2m datasets #5

Closed SiyuanHuangSJTU closed 1 year ago

SiyuanHuangSJTU commented 1 year ago

Hello! The work you have done is very interesting and has inspired me a lot. I am trying to reproduce the results in your paper for the GCN and SGC models on the ogbn-proteins and amazon2m datasets. Could you provide the training code used in your experiments (something similar to main-batch.py, which only contains the training code for NodeFormer, not for GCN or SGC) and the associated hyperparameters? This would allow us to better follow your work. Thanks!

qitianwu commented 1 year ago

Hi Siyuan,

Our code contains the implementations of GCN and SGC in gnn.py, and you can call these models directly by setting the method to sgc or gcn. Note that we did not use mini-batch training for these models, so the training pipeline should be main.py instead of main-batch.py. The hyper-parameters (layer number, hidden size, weight decay) are the same as for NodeFormer. Please let me know if you have further problems!
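As a rough illustration of what these baselines look like, here is a minimal self-contained sketch of the pattern (this is not our actual gnn.py; the class names, signatures, and the build_model helper are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCN(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, num_layers=2, dropout=0.5):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * (num_layers - 1) + [out_dim]
        self.lins = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )
        self.dropout = dropout

    def forward(self, x, adj):
        # adj: pre-normalized sparse adjacency D^-1/2 (A + I) D^-1/2
        for i, lin in enumerate(self.lins):
            x = torch.sparse.mm(adj, lin(x))  # transform, then propagate
            if i < len(self.lins) - 1:
                x = F.dropout(F.relu(x), p=self.dropout, training=self.training)
        return x

class SGC(nn.Module):
    def __init__(self, in_dim, out_dim, hops=2):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)  # single linear classifier
        self.hops = hops

    def forward(self, x, adj):
        for _ in range(self.hops):  # K-hop feature propagation, no nonlinearity
            x = torch.sparse.mm(adj, x)
        return self.lin(x)

def build_model(method, in_dim, hidden_dim, out_dim):
    # hypothetical helper: maps a method string (e.g. 'gcn' or 'sgc') to a baseline model
    if method == 'gcn':
        return GCN(in_dim, hidden_dim, out_dim)
    if method == 'sgc':
        return SGC(in_dim, out_dim)
    raise ValueError(f'unknown method: {method}')
```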

SiyuanHuangSJTU commented 1 year ago

Hello! Thank you very much for your quick reply, but I have one more small question. Table 2 in the paper reports the GPU memory used for GCN and SGC training on OGB-Proteins with a batch size of 10K (2.5 GB and 1.2 GB respectively). Could you confirm whether these numbers were measured with a batch size of 10K or with full-graph training? When I tried training on the whole graph, GCN and SGC used much more than 2.5 GB and 1.2 GB of GPU memory, which led to an OOM situation. Thank you again for your attention to this question!

qitianwu commented 1 year ago

Hi, after double-checking this, we indeed used mini-batch training (with the same batch size as NodeFormer) when measuring the memory consumption of GCN/SGC reported in Tables 2/3, which guarantees a fair comparison. That said, the accuracy of GCN/SGC reported in Tables 2/3 is the best among full-batch training, mini-batch training, and the scores reported in prior work (e.g., OGB).
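To make concrete what mini-batch training means for the memory numbers, here is a rough sketch of random-partition mini-batching (an assumption for illustration, not the exact pipeline in main-batch.py):

```python
import torch
import torch.nn.functional as F

def train_one_epoch(model, x, adj, labels, train_mask, optimizer, batch_size=10000):
    # Shuffle nodes into batches of ~10K; each step only materializes the
    # induced subgraph of one batch, which is what bounds GPU memory.
    n = x.size(0)
    perm = torch.randperm(n)
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        sub_adj = adj[idx][:, idx]  # induced subgraph (dense slice for brevity)
        sub_x, sub_y, sub_mask = x[idx], labels[idx], train_mask[idx]
        optimizer.zero_grad()
        out = model(sub_x, sub_adj)
        # single-label loss shown here; ogbn-proteins is multi-label, so
        # BCE-with-logits would replace this in that setting
        loss = F.cross_entropy(out[sub_mask], sub_y[sub_mask])
        loss.backward()
        optimizer.step()
```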

Due to limited time, we could not run each baseline method on such large datasets with an exhaustive search space. The baseline models here are therefore put in a relatively advantageous position for the comparison, w.r.t. performance and efficiency respectively.

SiyuanHuangSJTU commented 1 year ago

Ok I understand. Thank you for your reply!