Open xioacd99 opened 2 months ago
A simple and elegant work, and it seems to be the state-of-the-art graph transformer for node classification.

I notice that the largest dataset used in your paper is ogbn-products, with about 2 million nodes, and I wonder whether Polynormer can be used on much larger datasets such as ogbn-papers100M, which has about 100 million nodes.

Similar work such as SGFormer gives experimental results on ogbn-arxiv/products/papers100M, so I am confused why Polynormer, which is also a linear transformer, has no experiments on ogbn-papers100M.

Hi,

Thanks for your interest!

Yes, Polynormer can also be used on larger datasets, including ogbn-papers100M. Polynormer employs a random-partition method for mini-batch training, similar to the approach used by SGFormer for scaling to large graphs. By adjusting the batch size, Polynormer avoids GPU out-of-memory issues regardless of the size of the underlying graph. While evaluating on even larger datasets would be ideal, we believe our experiments on 2M-node graphs already demonstrate that Polynormer scales effectively to large graphs through mini-batch training. I hope this answers your question.
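For anyone curious what random-partition mini-batch training looks like in practice, here is a minimal sketch in PyTorch Geometric. The helper names (`random_partition_batches`, `train_epoch`) and the model call signature are assumptions for illustration, not Polynormer's actual code; the repo's own dataloader may differ in details such as train-mask handling.

```python
import torch
import torch.nn.functional as F
from torch_geometric.utils import subgraph

def random_partition_batches(num_nodes, edge_index, x, y, num_parts):
    """Randomly split nodes into `num_parts` partitions and yield the
    induced subgraph of each partition (hypothetical helper)."""
    perm = torch.randperm(num_nodes)
    for part in perm.chunk(num_parts):
        # Keep only edges whose endpoints both fall in this partition,
        # relabeling node indices to 0..len(part)-1.
        sub_edge_index, _ = subgraph(
            part, edge_index, relabel_nodes=True, num_nodes=num_nodes
        )
        yield x[part], y[part], sub_edge_index

def train_epoch(model, optimizer, data, num_parts):
    """One optimizer step per random partition. `model` is assumed to be
    a node classifier taking (features, edge_index); for OGB datasets
    the loss would normally be restricted to train-mask nodes."""
    model.train()
    for x_b, y_b, ei_b in random_partition_batches(
        data.num_nodes, data.edge_index, data.x, data.y, num_parts
    ):
        optimizer.zero_grad()
        out = model(x_b, ei_b)
        loss = F.cross_entropy(out, y_b)
        loss.backward()
        optimizer.step()
```

The key property, under these assumptions, is that peak GPU memory scales with the partition size rather than the full graph size, which is why raising `num_parts` (i.e., shrinking the batch) can keep even a 100M-node graph within memory.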