pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

[Roadmap] Temporal Graph Support 🚀 #3230

Open rusty1s opened 3 years ago

rusty1s commented 3 years ago

Updated on Nov 27th.

This is a roadmap towards better temporal data and temporal GNN support in PyG (targeted for v2.5). It spans data handling, data loading, models, and examples.

Data Handling

Previously, we used TemporalData as our abstraction to represent temporal graphs. However, this data structure is limited: among other things, it can only hold a stream of events, it mixes homogeneous and heterogeneous data, and it cannot naturally deal with node features. As such, we want to deprecate TemporalData and move time handling to Data and HeteroData explicitly.
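To make the contrast concrete, here is a minimal, library-free sketch of the two views: a flat stream of timestamped events versus a graph object that carries node features and per-edge timestamps side by side. The names `Event`, `TimedGraph`, and `events_to_graph` are hypothetical and only for illustration; they are not PyG classes, and the actual attribute layout in Data and HeteroData may differ.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One interaction in an event stream (hypothetical, for illustration)."""
    src: int
    dst: int
    t: int


@dataclass
class TimedGraph:
    """Hypothetical stand-in for a graph with explicit time attributes."""
    x: list           # per-node features, one row per node
    edge_index: list  # [source nodes, destination nodes]
    edge_time: list   # one timestamp per edge, aligned with edge_index


def events_to_graph(events, x):
    """Convert an event stream into a graph with per-edge timestamps."""
    # Sort chronologically so that edge order matches time order.
    events = sorted(events, key=lambda e: e.t)
    return TimedGraph(
        x=x,
        edge_index=[[e.src for e in events], [e.dst for e in events]],
        edge_time=[e.t for e in events],
    )


events = [Event(0, 1, 5), Event(1, 2, 3), Event(0, 2, 9)]
graph = events_to_graph(events, x=[[1.0], [0.5], [0.0]])
```

Unlike the bare event list, the graph view has a natural place for node features, which is one of the limitations mentioned above.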

Open Questions:

Data Loading

With Data and HeteroData having time support, we can utilize PyG's data loaders to take care of temporal subgraph sampling. Importantly, NeighborLoader and LinkNeighborLoader have already been equipped with temporal sampling support.
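As a rough illustration of the constraint that temporal sampling has to respect, the following library-free sketch samples neighbors of a seed node while only considering edges that are not newer than the seed's timestamp. The function `sample_temporal_neighbors` and the `(src, dst, t)` edge tuples are assumptions made for this sketch; it does not reproduce how NeighborLoader is implemented.

```python
import random


def sample_temporal_neighbors(edges, seed, seed_time, num_neighbors, rng=None):
    """Sample up to `num_neighbors` in-neighbors of `seed` seen by `seed_time`.

    `edges` is a list of (src, dst, t) tuples. Only edges pointing at `seed`
    with timestamp t <= seed_time are eligible, so no information from the
    future leaks into the sampled neighborhood.
    """
    rng = rng or random.Random(0)
    candidates = [(src, t) for src, dst, t in edges
                  if dst == seed and t <= seed_time]
    if len(candidates) <= num_neighbors:
        return candidates
    return rng.sample(candidates, num_neighbors)


edges = [(1, 0, 1), (2, 0, 4), (3, 0, 7), (4, 5, 2)]
# The edge (3, 0, 7) is excluded because it happens after the seed time 5.
neighbors = sample_temporal_neighbors(edges, seed=0, seed_time=5,
                                      num_neighbors=10)
```

A multi-hop sampler would repeat the same filter at every hop, reusing the seed's timestamp as the upper bound.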

Models

Currently, we only expose TGN as a temporal GNN model, which operates on TemporalData and its own sampler implementation.

Examples and Tutorials

Afterwards, we need clear and descriptive examples and tutorials on how to leverage PyG for node-level and link-level temporal applications.

Advanced Features

One current limitation of most temporal GNN models is that they learn an embedding per node, which gets updated over time. However, it is not necessarily scalable to keep this embedding on the GPU due to memory constraints. As such, advanced features may include data structures to query and train these embeddings on CPU, with efficient asynchronous device transfers to avoid host2device/device2host bottlenecks. Such a data structure may be inspired by the GNNAutoScale paper.
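The access pattern behind such a data structure can be sketched without any library: a host-side table holds one embedding per node, a mini-batch pulls only the rows it needs, and the updated rows are pushed back afterwards. `EmbeddingStore`, `pull`, and `push` are hypothetical names chosen for this sketch. It deliberately leaves out the parts that matter for performance (pinned memory and asynchronous device transfers) and only shows the bookkeeping.

```python
class EmbeddingStore:
    """Hypothetical host-side table of per-node embeddings."""

    def __init__(self, num_nodes, dim):
        # One zero-initialized row per node, kept in host (CPU) memory.
        self.table = [[0.0] * dim for _ in range(num_nodes)]

    def pull(self, node_ids):
        """Return copies of the rows needed by the current mini-batch."""
        return [list(self.table[i]) for i in node_ids]

    def push(self, node_ids, rows):
        """Write updated embeddings for the mini-batch back into the table."""
        for i, row in zip(node_ids, rows):
            self.table[i] = list(row)


store = EmbeddingStore(num_nodes=4, dim=2)
batch = [1, 3]
rows = store.pull(batch)                        # would be moved to the device
updated = [[v + 1.0 for v in r] for r in rows]  # placeholder for a model update
store.push(batch, updated)                      # would be moved back to host
```

Only the rows touched by a batch ever need to leave host memory, which is what keeps the device footprint independent of the total number of nodes.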

SalvishGoomanee commented 1 year ago

Thanks @rusty1s!

I can update the TemporalData class in torch_geometric/data/temporal.py and submit a pull request? (Unless there's already some work on this, but I am not aware of any at the moment.)

rusty1s commented 1 year ago

Sounds wonderful, thank you!

SalvishGoomanee commented 1 year ago

Done!

otaviocx commented 1 year ago

Thanks, @SalvishGoomanee!

Could you please link the PR?

SalvishGoomanee commented 1 year ago

Yes here it is: https://github.com/pyg-team/pytorch_geometric/pull/7573

rajveer43 commented 1 year ago

@rusty1s I would like to work on support for temporal GNN layers and models (including examples), if no one else is working on it!

rusty1s commented 1 year ago

Cool :) We are currently aiming to provide better support for the "Temporal Graph Benchmark (TGB)" in PyG, but no one has started to work on this yet.

rajveer43 commented 1 year ago

> Cool :) We are currently aiming to provide better support for the "Temporal Graph Benchmark (TGB)" in PyG, but no one has started to work on this yet.

Okay, I will work on this starting Monday! I know how to code it. Could you tell me exactly where to put the code, i.e. which files to edit?

DamianSzwichtenberg commented 10 months ago

I'm working on merging TemporalData with Data/HeteroData. Draft PR available here: #8454

winun1127 commented 5 months ago

For now, when dealing with CTDG, is it ok to use TemporalData, TemporalDataLoader, and TGN?

Or do you recommend using Data instead of TemporalData?

rusty1s commented 5 months ago

We are very slowly migrating away from TemporalData to a world where Data is also the preferred choice of class to represent temporal graphs. For now, however, you are ok with using TemporalData.