awslabs / dgl-lifesci

Python package for graph neural networks in chemistry and biology
Apache License 2.0

Question on residual connection code for GCN #185

Closed kangqiyue closed 1 year ago

kangqiyue commented 1 year ago

Hello, I recently used the GCN model in DGL-LifeSci and noticed that a residual connection is implemented. However, I have some questions about the code at this link:

https://github.com/awslabs/dgl-lifesci/blob/f854adf6f41ec5825f76783fb64c2b59ab520ef5/python/dgllife/model/gnn/gcn.py#L85

      new_feats = self.graph_conv(g, feats)
      if self.residual:
          res_feats = self.activation(self.res_connection(feats))
          new_feats = new_feats + res_feats
      new_feats = self.dropout(new_feats)
      if self.bn:
          new_feats = self.bn_layer(new_feats)
      return new_feats

The residual connection is applied to the features produced by the GraphConv layer. However, I think the residual connection should be applied to the original features (named `feats` in the code above). I think the code should probably be written as follows (maybe):

      new_feats = self.graph_conv(g, feats)
      new_feats = self.activation(new_feats)
      new_feats = self.dropout(new_feats)
      if self.residual:
          new_feats = new_feats + feats
      if self.bn:
          new_feats = self.bn_layer(new_feats)
      return new_feats

Could you check the code, or explain why the residual connection was implemented in the first style instead of the second? Thanks!

mufeili commented 1 year ago

The feature sizes of `feats` and `new_feats` are not necessarily the same. Therefore, we use a linear layer to project the input features.
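A minimal sketch of the shape issue, in plain PyTorch with hypothetical sizes (64-dim input, 128-dim output; `res_connection` here mimics the linear layer used in `GCNLayer`):

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration: the graph conv maps
# 64-dim input features to 128-dim output features.
in_feats, out_feats = 64, 128
feats = torch.randn(8, in_feats)       # 8 nodes, 64-dim input features
new_feats = torch.randn(8, out_feats)  # stand-in for GraphConv output, 128-dim

# Direct addition new_feats + feats would fail: (8, 128) vs (8, 64).
# A linear projection brings the input features to the output size first.
res_connection = nn.Linear(in_feats, out_feats)
res_feats = torch.relu(res_connection(feats))  # shape (8, 128)
out = new_feats + res_feats                    # shapes now match
print(out.shape)  # torch.Size([8, 128])
```

With equal input and output sizes, the direct addition in the second style would also work; the projection makes the skip connection well-defined in the general case.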

kangqiyue commented 1 year ago

@mufeili ok, I understand, thanks!

In addition, I want to point out that the linear layer is created inside the `GCNLayer` class, so one linear layer is instantiated for each layer of the final GCN model. That is what confused me, because one linear layer per `GCNLayer` adds a lot of extra parameters. In my case, I simply removed the linear layer, since my dimensions are fixed.
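For reference, a hedged sketch of that identity-skip variant, with a plain `nn.Linear` standing in for `GraphConv` and an assumed hidden size of 128 (so input and output dimensions match and no projection parameters are needed):

```python
import torch
import torch.nn as nn

# Assumed fixed hidden size; input and output dimensions are equal,
# so the skip connection can be the identity.
hidden = 128
graph_conv = nn.Linear(hidden, hidden)  # stand-in for a GraphConv layer
feats = torch.randn(8, hidden)          # 8 nodes, 128-dim features

new_feats = torch.relu(graph_conv(feats))
out = new_feats + feats  # identity skip: no extra projection parameters
print(out.shape)  # torch.Size([8, 128])
```

This only works when every layer keeps the same feature size; with changing sizes, the per-layer projection in `GCNLayer` is what keeps the addition well-defined.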

Thanks again!

mufeili commented 1 year ago

Glad to know that!