pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Integrated Spatio-Temporal GNN #3012

Open AmEskandari opened 3 years ago

AmEskandari commented 3 years ago

Hello,

As we discussed earlier regarding adding spatio-temporal GNN models to PyG, I suggest the papers listed below as a starting point. Please take a look at them.

1. Structured Sequence Modeling with Graph Convolutional Recurrent Networks
2. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
3. Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Best Regards, Amir

rusty1s commented 3 years ago

Thanks. Most of these models are already implemented in PyTorch Geometric Temporal, but I think it would be great to have an example of using those spatio-temporal GNNs directly in PyG nonetheless. How about we add the second one as an operator to PyG, and provide a clear and minimal example to show-case its usage?

AmEskandari commented 3 years ago

> Thanks. Most of these models are already implemented in PyTorch Geometric Temporal, but I think it would be great to have an example of using those spatio-temporal GNNs directly in PyG nonetheless. How about we add the second one as an operator to PyG, and provide a clear and minimal example to show-case its usage?

Do you mean adding examples like these? (https://pytorch-geometric.readthedocs.io/en/latest/notes/colabs.html)

rusty1s commented 3 years ago

I would prefer an example in the examples/ directory.

AmEskandari commented 3 years ago

OK. I will work on it.

always-ready2learn commented 3 years ago

Hi, is it possible to define a 2D matrix as edge_attr / node features? That is, instead of one row with multiple columns per edge/node, I want to attach a variable number of rows (with the same number of columns) to each edge/node in the graph.

rusty1s commented 3 years ago

Yes, you can store tensors with arbitrary dimensions in data, e.g., edge_attr can have shape [num_edges, num_timestamps, num_features].

always-ready2learn commented 3 years ago

Thanks, Can you please give an example?

AmEskandari commented 3 years ago

Hello, I wrote a naive, minimal example of DCRNN. Take a look at it.

import torch
from torch_geometric_temporal.dataset import PemsBayDatasetLoader
from torch_geometric_temporal.nn.recurrent import DCRNN
from torch_geometric_temporal.signal import temporal_signal_split

loader = PemsBayDatasetLoader()
dataset = loader.get_dataset()
train_dataset, test_dataset = temporal_signal_split(dataset, train_ratio=0.7)

# Stack the snapshots into tensors of shape [num_snapshots, 325, 24].
Dynamic_node_Features_Train = torch.FloatTensor(train_dataset.features).view(-1, 325, 24)
Dynamic_node_Features_Test = torch.FloatTensor(test_dataset.features).view(-1, 325, 24)
# The graph structure is static, so it is shared by all snapshots.
Static_edge_index = torch.LongTensor(train_dataset.edge_index)
Static_edge_weight = torch.FloatTensor(train_dataset.edge_weight)

feature_dim = int(Dynamic_node_Features_Train.shape[2])

# The model predicts the node features of the next snapshot.
model = DCRNN(feature_dim, feature_dim, 5)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

def train():
    model.train()
    num_steps = Dynamic_node_Features_Train.shape[0] - 1
    for epoch in range(100):
        cost = 0
        for time in range(num_steps):
            y_hat = model(Dynamic_node_Features_Train[time], Static_edge_index, Static_edge_weight)
            loss = torch.mean((y_hat - Dynamic_node_Features_Train[time + 1]) ** 2)
            cost = cost + loss
            if time > 100 and time % 100 == 0:
                print(f'MSE loss in epoch {epoch} at time {time}: {loss.item()}')
        # One optimizer step per epoch on the mean loss over all time steps.
        cost = cost / num_steps
        optimizer.zero_grad()
        cost.backward()
        optimizer.step()

@torch.no_grad()
def test():
    model.eval()
    cost = 0
    num_steps = Dynamic_node_Features_Test.shape[0] - 1
    for time in range(num_steps):
        # Evaluate on the held-out test snapshots, not the training ones.
        y_hat = model(Dynamic_node_Features_Test[time], Static_edge_index, Static_edge_weight)
        loss = torch.mean((y_hat - Dynamic_node_Features_Test[time + 1]) ** 2)
        cost += loss
        print(f'MSE loss at time {time}: {loss.item()}')
    print(f'Mean test MSE: {(cost / num_steps).item()}')

rusty1s commented 3 years ago

This is great. Thank you! Please submit it as a PR :)

We may need to add support for DCRNN and the PemsBayDataset dataset directly in PyG, as I would prefer a pure PyTorch/PyG example.

Flunzmas commented 3 years ago

> Most of these models are already implemented in PyTorch Geometric Temporal, but I think it would be great to have an example of using those spatio-temporal GNNs directly in PyG nonetheless. How about we add the second one as an operator to PyG, and provide a clear and minimal example to show-case its usage?

> We may need to add support for DCRNN and the PemsBayDataset dataset directly in PyG, as I would prefer a pure PyTorch/PyG example.

I'm currently using both PyG and PyG-Temporal for my projects on spatio-temporal graphs, and I feel like the interplay between PyG's Data/Batch and PyGT's Signal/SignalBatch objects currently has its caveats. I think that either the functionality for dealing with spatio-temporal graphs needs to be fully externalized to PyGT (and their API re-adjusted to align with PyG), or PyGT needs to be fully "re-implemented" in PyG.

From your comments it seems as if you'd prefer the second solution, am I right with this assumption?

rusty1s commented 3 years ago

I'm really interested to hear more about the caveats between Data in PyG and Signal in PyGT. Please elaborate.

I think from my perspective, I'm not so interested in replicating the complete set of functionality from PyGT in PyG, but more interested in providing the necessary tools to allow for efficient updates to a graph from a constant stream of events. For example, such functionality is currently deeply hidden in the TGNMemory module. It would be great to have a Data object exposed to the user that can be automatically updated, and which can be used to easily sample based on temporal information, e.g., sample neighbors with higher probability in case they were just recently added.
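To make the idea of recency-biased sampling concrete, here is a rough sketch in plain PyTorch. It is not an existing PyG API: the helper sample_recent_neighbors, the exponential decay, and the temperature parameter tau are all assumptions chosen for illustration.

```python
import torch

def sample_recent_neighbors(neighbors, timestamps, num_samples, now, tau=1.0):
    """Sample neighbors without replacement, favoring recently added ones.

    Hypothetical helper: the weight of a neighbor decays exponentially with
    its age (now - timestamp); a smaller tau sharpens the preference.
    """
    age = (now - timestamps).clamp(min=0).float()
    weights = torch.softmax(-age / tau, dim=0)
    num_samples = min(num_samples, neighbors.numel())
    idx = torch.multinomial(weights, num_samples, replacement=False)
    return neighbors[idx], weights

torch.manual_seed(0)
neighbors = torch.tensor([10, 11, 12, 13])       # neighbor node ids
timestamps = torch.tensor([1.0, 5.0, 8.0, 9.5])  # time each edge was added
sampled, weights = sample_recent_neighbors(
    neighbors, timestamps, num_samples=2, now=10.0, tau=2.0)
print(sampled, weights)
```

A softmax over negative ages is only one possible weighting. A hard time window or a linear decay would fit the same interface.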

Flunzmas commented 3 years ago

An example I encountered:

Converting a temporal sequence of graph data that is stored in Data objects into a Signal object works as intended, as does the extraction of individual snapshots from that Signal. However, when aiming at doing the same with batched graph data, the corresponding SignalBatch objects, when iterated over, yield Batch objects constructed with batch = Batch(...) instead of batch = Batch.from_data_list(...) (see e.g. here). This means that the yielded Batch objects can be used much less flexibly (e.g. no iteration/de-batching).

This may look like nitpicking, but in the long run, people using PyG together with PyGT will probably want more convenient ways of converting between the non-temporal and temporal types :).

rusty1s commented 3 years ago

I see. Integrating such Signal data objects directly into PyG would let us fix such inconsistencies between the two libraries more conveniently. On the PyGT side, some private attributes required for de-batching are missing (https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/data/batch.py#L78-L80).

Pinging @benedekrozemberczki here to see if he has more thoughts on the interplay between both libraries.