pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Modification of Graph UNet, but it doesn't work.. #2193

Open hkim716 opened 3 years ago

hkim716 commented 3 years ago

I'm trying to modify GraphUNet to do regression on the third column of pos, i.e. pos[:, 2], with my own dataset. MyData contains many graphs, and each graph has the same number of nodes and edges. dataset[0] looks like Data(edge_index=[2, 74], pos=[74, 3], test_mask=[74, 1], train_mask=[74, 1]).

I would like to train my Net, but the predictions do not update when I train for 10 epochs. When I print the epoch number and loss values, the network does not seem to learn anything:

epoch: 0 | total_loss : 0.08977822025347816
epoch: 1 | total_loss : 0.08977822025347816
epoch: 2 | total_loss : 0.08977822025347816
epoch: 3 | total_loss : 0.08977822025347816
epoch: 4 | total_loss : 0.08977822025347816
epoch: 5 | total_loss : 0.08977822025347816
epoch: 6 | total_loss : 0.08977822025347816
epoch: 7 | total_loss : 0.08977822025347816
epoch: 8 | total_loss : 0.08977822025347816
epoch: 9 | total_loss : 0.08977822025347816

Matt, can you help me out? Here is my code.

import os.path as osp

import torch
import torch.nn.functional as F
from MyDataset import MyData
from torch_geometric.nn import GraphUNet
from torch_geometric.utils import dropout_adj
import numpy as np
from torch_geometric.data import DataLoader

dataset = MyData(root='./')
data = dataset[0]

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        pool_ratios = [32/74, 8/32, 3/8]
        self.unet = GraphUNet(in_channels = 1, hidden_channels=16, out_channels=1,
                              depth=3, pool_ratios=pool_ratios)

    def forward(self, data):
        x = data.pos[:,2].view(-1,1)
        edge_index = data.edge_index
#         batch = data.batch
#         edge_index, _ = dropout_adj(data.edge_index, p=0.2,
#                                     force_undirected=True,
#                                     num_nodes=data.num_nodes,
#                                     training=self.training)
#         x = F.dropout(x, p=0.1, training=self.training)
# ***************************************
        x = x.float()
        x = self.unet(x, edge_index)
        x = x.float()
# ***************************************
        return x

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1, weight_decay=0.001)

def train(dataset):
    loader = DataLoader(dataset, batch_size=1, shuffle=True) 

    # train
    for epoch in range(10):
        total_loss = 0
        model.train()
        for batch in loader:
            optimizer.zero_grad()
            pred = model(batch.to(device))

            label = batch.pos[:,2].view(-1,1)

            loss = torch.autograd.Variable(F.mse_loss(pred, label), requires_grad=True)
            loss.backward()

            optimizer.step()

            total_loss += loss.item()

        total_loss /= len(loader.dataset)
        print(f'epoch: {epoch} | total_loss : {total_loss}')

train(dataset)
rusty1s commented 3 years ago

Hi, first of all, the line loss = torch.autograd.Variable(F.mse_loss(pred, label), requires_grad=True) looks really weird to me. In general, you do not need to wrap the loss into its own Variable; it already has requires_grad=True by default. Wrapping it this way can detach the loss from the computation graph, which would explain why the parameters never update.

Second, you should verify that the model parameters receive a gradient, e.g., by looking at list(model.parameters())[0].grad.
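
For example, right after loss.backward() in your training loop you could add something like this (just a sketch for inspection, not part of the model):

            # Print the summed absolute gradient of every parameter.
            # `None` (or all zeros) means the loss is detached from the model
            # and the optimizer has nothing to update.
            for name, param in model.named_parameters():
                grad_sum = None if param.grad is None else param.grad.abs().sum().item()
                print(name, grad_sum)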

Otherwise, your code looks correct to me, although I'm a bit confused about the actual task. It seems like you simply want to reconstruct node features, which (a) does not look particularly useful to me, (b) might be difficult for GCN layers as they will smooth out node features.

In order to identify any issues with GraphUNet, you should first verify that a simpler GNN, e.g., stacking a few GCNConv layers, already produces useful results.
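
A minimal baseline could look something like this (only a sketch; the two-layer depth and the hidden size of 16 are arbitrary choices):

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class SimpleGCN(torch.nn.Module):
    def __init__(self, in_channels=1, hidden_channels=16, out_channels=1):
        super(SimpleGCN, self).__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)

    def forward(self, data):
        # Same input as in your Net: the third column of `pos` as a single node feature.
        x = data.pos[:, 2].view(-1, 1).float()
        x = F.relu(self.conv1(x, data.edge_index))
        return self.conv2(x, data.edge_index)

If this baseline already fits your targets well, any remaining problems are likely specific to GraphUNet rather than to your training loop.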

hkim716 commented 3 years ago

Thanks Matt, I solved the problem by deleting loss = torch.autograd.Variable(F.mse_loss(pred, label), requires_grad=True) and adding

            loss = F.mse_loss(pred.float(), label.float())
            loss.backward()

because it kept giving me errors when the dtypes did not match. Anyway, it is working now. I have also tried stacking pyg_nn.GCNConv layers with MyData, and that worked well with your kind help. :)

I need to reduce and then increase the number of nodes so I can build an autoencoder-like architecture for graph data. I have another question about the architecture of Graph U-Net. I can see that the encoder uses TopK pooling and GCNConv layers, but how does the decoder recover the original number of nodes? I only see GCNConv layers stacked in the decoder part, and in my understanding a GCNConv layer changes the number of node features, not the number of nodes. Also, does the decoder actually have trainable parameters when it unpools?

rusty1s commented 3 years ago

The Graph U-Net is not really an auto-encoder: it uses skip-connections between the different levels of coarsened graphs and maintains the graph adjacency information from the encoder for the decoder part, so there is no real bottleneck here. Unpooling is done by setting the node features of the filtered-out nodes to zero and adding the initial features as skip-connections.
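
In code, one unpooling step roughly looks like this (a simplified sketch of the idea, not the exact library implementation; perm holds the node indices kept by TopK pooling at that level, res the saved features from the corresponding encoder level):

up = torch.zeros_like(res)  # zero features for all nodes of the finer graph
up[perm] = x                # scatter the coarsened features back to the kept positions
x = res + up                # add the encoder features as the skip-connection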

hkim716 commented 3 years ago
  1. Then what kind of GNN architecture would you use for node feature reconstruction by reducing the number of nodes and recovering them again?
  2. What is the skip-connection in TopKPooling? Does it just skip computation, like dropout does?
  3. Is the unpooling layer in GraphUNet a commonly used unpooling layer for a decoder that recovers node features with the same number of nodes as the input?
rusty1s commented 3 years ago
  1. This is an active field of research, which I am not that familiar with. A naive approach is to aggregate node features into a global graph representation (GNN + global readout), and then convert your node features back via an MLP mapping [num_global_features] to [num_nodes, num_node_features] (see the sketch after this list). Note that you lose permutation-equivariance that way.
  2. The skip-connection refers to skipping the "coarsening part", where final representations are computed based on initial feature representation + coarsened feature representation. You can read more about it in the Graph U-Nets paper.
  3. That's indeed the most straightforward way to implement unpooling, as it avoids having to worry about getting the right number of nodes and reconstructing adjacency information.
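
For point 1, a rough sketch could look like the following (the global_mean_pool readout, the layer sizes, and the fixed num_nodes=74 are illustrative assumptions based on your dataset, not a recommendation):

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GlobalBottleneckNet(torch.nn.Module):
    def __init__(self, in_channels=1, hidden_channels=16, num_nodes=74):
        super(GlobalBottleneckNet, self).__init__()
        self.conv = GCNConv(in_channels, hidden_channels)
        # Decode the global graph embedding back to one value per node.
        # This only works because every graph in MyData has the same number of nodes,
        # and it fixes a node ordering (no permutation-equivariance).
        self.decoder = torch.nn.Linear(hidden_channels, num_nodes * in_channels)
        self.in_channels = in_channels

    def forward(self, data):
        x = data.pos[:, 2].view(-1, 1).float()
        x = F.relu(self.conv(x, data.edge_index))
        # `data.batch` is set when the graphs come from a DataLoader.
        g = global_mean_pool(x, data.batch)    # [num_graphs, hidden_channels]
        out = self.decoder(g)                  # [num_graphs, num_nodes * in_channels]
        return out.view(-1, self.in_channels)  # [num_graphs * num_nodes, in_channels]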