Closed jexterliangsufe closed 11 months ago
(1) Are models with GNN really better than models without GNN?
Not necessarily. There is indeed a lot of recent work that attempts to solve spatio-temporal forecasting problems without graph neural networks, e.g. [1][2]
[1] A Simple yet Effective Baseline for Multivariate Time Series Forecasting
[2] SimST: A GNN-Free Spatio-Temporal Learning Framework for Traffic Forecasting
Models of the TimesNet type usually target long-range time series forecasting, and recent work shows that they are not as good as models that take spatial factors into account [3]
[3] HUTFormer: Hierarchical U-Net Transformer for Long-Term Traffic Forecasting
So this remains a matter of perspective, and different researchers hold different views.
(2) The number of nodes is often much greater than datasets in paper works.
For research papers, 1024 nodes is close to the largest graph size used, except for some works dealing specifically with traffic prediction on large graphs, which handle larger graphs via graph decomposition. For real-life applications, 1024 nodes is indeed too few; there is a genuine gap between current research and industrial deployment. Have you seen a paper that uses a particularly large number of nodes? If so, could you share it with me?
(3) How can I use your model to solve such problems?
PDFormer, GMAN, and other models based on graph attention are indeed not suitable for processing large graphs (limited by efficiency). You could consider borrowing techniques from Graph Transformers designed for computation on large graphs and updating this model so that it can handle them.
Thanks for your kind reply! I learned a lot.
Models do need engineering optimization to serve larger-scale real-world applications. Some research on graph networks has attempted to scale to larger graph structures:
NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
GraphSAINT: Graph Sampling Based Inductive Learning Method
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks
GNN-AutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings
Combining these efforts with a time-series forecasting model should enable large-scale prediction. Large-scale prediction is indeed a challenge and could be attempted as follow-up work~
One can also try the GNN-free models mentioned above for industrial applications.
Thanks a lot! I am greatly inspired by your reply.
By the way, I have a technical question. The data in a spatio-temporal problem has one more dimension than in a purely spatial problem (B × T × N × D vs. B × N × D). Assuming the latter can directly apply a GNN conv layer (e.g. GCNConv), the former needs to loop over the T time steps. Meanwhile, [1]'s code implementation uses the Einstein summation convention (torch.einsum('btnd,nm->btmd', x, A)) to replace the loop.
[1] Graph WaveNet for Deep Spatial-Temporal Graph Modeling
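To make the equivalence concrete, here is a small sketch (toy shapes, not from Graph WaveNet's actual code) showing that the per-time-step loop and the einsum produce the same result:

```python
import torch

# Hypothetical toy shapes: batch B, time T, nodes N, features D.
B, T, N, D = 2, 4, 5, 3
x = torch.randn(B, T, N, D)
A = torch.randn(N, N)  # dense adjacency / support matrix

# Loop version: apply graph propagation separately at each time step.
# For one step, out[b, m, d] = sum_n x[b, n, d] * A[n, m] == A.T @ x_t.
out_loop = torch.stack([torch.matmul(A.T, x[:, t]) for t in range(T)], dim=1)

# Einsum version: one call over all time steps at once.
out_einsum = torch.einsum('btnd,nm->btmd', x, A)

assert torch.allclose(out_loop, out_einsum, atol=1e-5)
```

The einsum avoids Python-level looping, but both versions materialize the same dense A.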
torch.einsum does not seem to support sparse matrix operations, which means this approach won't scale to large graphs because a dense A becomes excessively large. Do you have any suggestions on this?
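For reference, one possible workaround (a sketch under my own assumptions, not an answer given in this thread) is to fold the batch, time, and feature axes together so the einsum becomes a single matrix product over the node dimension, which torch.sparse.mm does accept:

```python
import torch

# Hypothetical toy shapes: batch B, time T, nodes N, features D.
B, T, N, D = 2, 4, 6, 3
x = torch.randn(B, T, N, D)
A_dense = (torch.rand(N, N) < 0.3).float()  # adjacency with a sparse pattern
A_sparse = A_dense.to_sparse()              # COO sparse tensor

# Dense einsum reference.
ref = torch.einsum('btnd,nm->btmd', x, A_dense)

# Sparse-friendly rewrite: fold B, T, D into one axis so the propagation
# becomes a single (N x N) sparse @ (N x B*T*D) dense product.
x2 = x.permute(2, 0, 1, 3).reshape(N, B * T * D)
out = torch.sparse.mm(A_sparse.t(), x2)          # (N, B*T*D)
out = out.reshape(N, B, T, D).permute(1, 2, 0, 3)  # back to (B, T, N, D)

assert torch.allclose(out, ref, atol=1e-4)
```

This keeps A in sparse form throughout, at the cost of two permute/reshape round trips per layer.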
Hi, great work! I have questions about the datasets mentioned in your paper and the models you compare with PDFormer.
I noticed the max number of nodes among the datasets is 1024 (T-Drive), which is not much greater than the number of variates handled by some of the newest Seq2Seq models (e.g. TimesNet, Autoformer, Informer, ...). The TimesNet paper compares TimesNet with other Seq2Seq models (but not GNN+Seq2Seq models) on a traffic dataset and achieves SOTA. Are models with GNN really better than models without GNN?
By the way, in real applications the number of nodes is often much greater than in the datasets used in papers. How can I use your model to solve such problems? Thanks anyway!