cornell-zhang / Polynormer

Polynormer: Polynomial-Expressive Graph Transformer in Linear Time
BSD 3-Clause "New" or "Revised" License

The possibility of applying Polynormer on large-scale datasets #2

Open xioacd99 opened 2 months ago

xioacd99 commented 2 months ago

This is a simple and elegant work, and it seems to be the state-of-the-art graph transformer for node classification.

I notice that the largest dataset used in your paper is ogbn-products with about 2 million nodes, and I wonder if Polynormer can be used on much larger datasets, such as ogbn-papers100M with about 100 million nodes.

Similar work such as SGFormer gives experimental results on ogbn-arxiv/products/papers100M, so I am confused why Polynormer, which is also a linear transformer, has no experiments on ogbn-papers100M.

Chenhui1016 commented 2 months ago

Hi,

Thanks for your interest!

Yes, Polynormer can also be used on larger datasets, including ogbn-papers100M. Polynormer employs a random partition method for mini-batch training, similar to the approach used by SGFormer for scaling to large graphs. By adjusting the batch size, Polynormer effectively avoids GPU out-of-memory issues, regardless of the size of the underlying graph. While evaluating on even larger datasets would be ideal, we believe our experiments on 2M-node graphs already demonstrate that Polynormer scales effectively to large graphs through mini-batch training. I hope this answers your question.
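For readers unfamiliar with the idea, here is a minimal sketch of what random-partition mini-batching looks like. This is an illustration only, not Polynormer's actual training loop: the function name and the batch sizes below are hypothetical, and the real implementation trains on node features and subgraph structure per partition.

```python
import numpy as np

def random_node_partition(num_nodes, batch_size, seed=0):
    """Randomly shuffle node IDs and split them into mini-batches.

    Each epoch, nodes are permuted and chunked, so peak GPU memory
    per training step depends on batch_size rather than on the total
    number of nodes in the graph.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    return [perm[i:i + batch_size] for i in range(0, num_nodes, batch_size)]

# Hypothetical example: 2M nodes (roughly ogbn-products scale)
# split into 100k-node partitions, each processed independently.
batches = random_node_partition(2_000_000, 100_000)
print(len(batches))
```

Since the partition is re-randomized every epoch, each node sees a different neighborhood context across epochs, which is what makes this simple scheme viable despite dropping cross-partition edges within a step.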