pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Training loss gets stuck when training a GCNConv model #2820

Open napatnicky opened 3 years ago

napatnicky commented 3 years ago

Hi Matthias,

Thank you for your amazing library. I am stuck training a GCNConv model on my own dataset (about 1000 samples with a fixed 6888x6888 adjacency matrix but different signals, 1 feature per node). It's a binary graph classification task. My training loss doesn't seem to converge; it gets stuck at the same value.

Currently, I'm training the model with batch size = 4, Adam(lr = 0.01), and hidden_dim = 16. I have also tried adjusting the batch size and learning rate, but the problem remains. Do you have any suggestions to overcome this problem?

Kind regards, Napat

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class GCN(nn.Module):
    def __init__(self, dropout, hidden_dim):
        super().__init__()
        self.conv = GCNConv(1, hidden_dim)
        self.bn = nn.BatchNorm1d(hidden_dim)
        self.dropout = dropout
        self.pool = global_mean_pool
        self.classifier = nn.Linear(hidden_dim, 1)

    def reset_parameters(self):
        self.conv.reset_parameters()
        self.bn.reset_parameters()
        self.classifier.reset_parameters()

    def forward(self, batched):
        x, edge_index, batch = batched.x, batched.edge_index, batched.batch
        # 1. Node embedding
        x = self.conv(x, edge_index)
        x = self.bn(x)
        x = F.relu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        # 2. Global pooling: [num_nodes, hidden_dim] -> [batch_size, hidden_dim]
        x = self.pool(x, batch)
        # 3. Classifier
        x = self.classifier(x)

        return torch.sigmoid(x)  # F.sigmoid is deprecated
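
For reference, a minimal training loop for the setup above (batch size = 4, Adam with lr = 0.01, BCE loss on the sigmoid output) would look roughly like this sketch; dataset stands in for my actual list of Data objects and the dropout value is just a placeholder:

# Rough sketch of the training setup described above; `dataset` is a placeholder
# for the actual list of Data objects (1 node feature, binary graph label).
import torch
from torch_geometric.loader import DataLoader

model = GCN(dropout=0.5, hidden_dim=16)  # dropout value assumed
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = torch.nn.BCELoss()  # the model already applies sigmoid
loader = DataLoader(dataset, batch_size=4, shuffle=True)

model.train()
for epoch in range(100):
    total_loss = 0
    for batch in loader:
        optimizer.zero_grad()
        out = model(batch).view(-1)              # [batch_size]
        loss = criterion(out, batch.y.float())   # binary targets in {0, 1}
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f'Epoch {epoch:03d}, Loss: {total_loss / len(loader):.4f}')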
rusty1s commented 3 years ago

This is hard to say TBH. In general, GCN might be too limited when operating on single node feature values, and you may have better luck with other GNN operators such as GATConv.
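
For example, swapping the operator is a one-line change in the __init__ of the model above (just a sketch; heads=4 and concat=False are illustrative choices that keep the output at hidden_dim):

# Sketch: replace the single GCNConv with a GATConv (attention-based aggregation).
from torch_geometric.nn import GATConv

# in GCN.__init__ (concat=False keeps the output shape at [*, hidden_dim]):
self.conv = GATConv(1, hidden_dim, heads=4, concat=False)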

Otherwise, you may want to try to heavily overfit on your data first, e.g., by increasing the general hidden dimensionality and the number of layers of your final classifier. Furthermore, the global_mean_pool may lose meaningful features when averaging node embeddings across a larger graph. You may want to try out different pooling operators as well, such as global_add_pool or global_max_pool.
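
As a rough sketch against the model above (layer sizes are illustrative, and reset_parameters would need updating for the Sequential head):

# Sketch: alternative readouts and a deeper classifier head for the GCN class above.
from torch_geometric.nn import global_add_pool, global_max_pool

# in GCN.__init__:
self.pool = global_add_pool  # or global_max_pool
self.classifier = nn.Sequential(  # deeper head; adjust reset_parameters accordingly
    nn.Linear(hidden_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, 1),
)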

lingchen1991 commented 3 years ago

Have you solved the problem? I'm hitting the same issue. My loss does not decrease and only oscillates. I tried 1) increasing the hidden dimension, 2) increasing the learning rate, and 3) making the input node features the same as the label. It still cannot converge.

rusty1s commented 3 years ago

Can you check other GNN ops as well, such as SAGEConv?
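
For example, as a one-line sketch against the model above:

# Sketch: mean-aggregation SAGEConv in place of GCNConv, in GCN.__init__.
from torch_geometric.nn import SAGEConv

self.conv = SAGEConv(1, hidden_dim)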

napatnicky commented 3 years ago

@lingchen1991 Yes, I have. In my case, the problem occurs when dealing with a larger graph and limited features (only 1 feature per node, and not a very informative one). A global pooling layer might not be meaningful enough to compress all the nodes of such a graph into a single vector. So I decided to use a flatten layer instead of the global pooling layer, and that worked for me.

However, the flatten layer throws away the structure of the graph as well ;). I am still dealing with this problem.
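
Roughly, what I mean is something like the following sketch (GCNFlatten is just an illustrative name; it assumes every graph has exactly 6888 nodes):

# Sketch of the flatten-instead-of-pool variant. Assumes every graph has exactly
# `num_nodes` nodes (6888 in my case), so the [batch_size * num_nodes, hidden_dim]
# node matrix can be reshaped into one fixed-size vector per graph.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNFlatten(nn.Module):
    def __init__(self, dropout, hidden_dim, num_nodes=6888):
        super().__init__()
        self.conv = GCNConv(1, hidden_dim)
        self.bn = nn.BatchNorm1d(hidden_dim)
        self.dropout = dropout
        self.num_nodes = num_nodes
        self.classifier = nn.Linear(num_nodes * hidden_dim, 1)

    def forward(self, batched):
        x, edge_index = batched.x, batched.edge_index
        x = F.relu(self.bn(self.conv(x, edge_index)))
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = x.view(-1, self.num_nodes * x.size(-1))  # [batch_size, num_nodes * hidden_dim]
        return torch.sigmoid(self.classifier(x))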