CarloLucibello / GraphNeuralNetworks.jl

Graph Neural Networks in Julia
https://carlolucibello.github.io/GraphNeuralNetworks.jl/dev/

Question about the GCNConv layer code #211

Closed: 5tinzi closed this issue 1 year ago

5tinzi commented 1 year ago

Thanks for such a wonderful project! I'm having some trouble following the idea behind the GCNConv layer code.

Allow me to begin by presenting my understanding of the GCN. From the GCN paper Semi-Supervised Classification with Graph Convolutional Networks, the computation performed by a GCN layer is simply:

$$\displaylines{ A = \hat{D}^{-\frac{1}{2}} \hat{A} \hat{D}^{-\frac{1}{2}} \\ X^{l+1} = f(W, A) = \sigma(A X^{l} W) }$$

where $A$ is the symmetrically normalized adjacency matrix (with self-connections), which could be computed once in a pre-processing step, and $X^{l+1}$ is the output of layer $l$.
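
For reference, the paper defines the hatted quantities as the adjacency matrix with added self-connections and its degree matrix:

$$\hat{A} = A + I_N, \qquad \hat{D}_{ii} = \sum_j \hat{A}_{ij}$$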

Intuitively, a GCNConv layer could be built with the following (pseudo)code:

using Graphs, LinearAlgebra

function pre_processing(g)
  A_hat = adjacency_matrix(g) + I            # Â = A + I, adds node self-connections
  d_hat = vec(sum(A_hat, dims=2))            # degree vector of Â (not of the original g)
  D_inv_sqrt = Diagonal(1 ./ sqrt.(d_hat))   # D̂^(-1/2) as a diagonal matrix
  return D_inv_sqrt * A_hat * D_inv_sqrt     # D̂^(-1/2) Â D̂^(-1/2)
end

struct GCNConv
  w
  A
end

# outer constructor: pre-compute the normalized adjacency once per graph
function GCNConv(g, in, out)
  A = pre_processing(g)
  w = randn(in, out)   # W maps `in` features to `out` features, as X W in the paper
  GCNConv(w, A)
end

# layer forward pass: x is a (num_nodes, in) feature matrix
sigma(z) = 1 / (1 + exp(-z))   # logistic activation standing in for σ
function (l::GCNConv)(x)
  sigma.(l.A * x * l.w)        # σ(A X W)
end

Then we can define and apply a GCN layer as follows:

# given a graph g and a (num_nodes, in) node feature matrix x
l = GCNConv(g, in, out)
l(x)     # one GCN layer
l(l(x))  # two GCN layers (stacking the same layer requires in == out)

The above is my very personal understanding, and maybe a preconceived notion, but I find the GCNConv design in GraphNeuralNetworks.jl a bit difficult to follow:

  1. In the GraphNeuralNetworks.jl GCNConv layer, there seems to be no pre-processing step for something like $\hat{D}^{-\frac{1}{2}} \hat{A} \hat{D}^{-\frac{1}{2}}$, so it seems that each call to a GCNConv layer repeats these calculation steps. During model training the layer is called hundreds of thousands of times, so won't this slow down training?
  2. Another question: GCNConv applies x = x .* c' twice (in lines 102 and 110), so it seems to perform $x \hat{D}^{-\frac{1}{2}} \hat{D}^{-\frac{1}{2}}$ rather than $\hat{D}^{-\frac{1}{2}} \hat{A} \hat{D}^{-\frac{1}{2}}$. It is hard to follow the idea here; any tips or suggestions? https://github.com/CarloLucibello/GraphNeuralNetworks.jl/blob/afd80e8024abb270db79cd8592376ba60bc63f60/src/layers/conv.jl#L100-L110

Any advice would be appreciated and thank you for your patience.

CarloLucibello commented 1 year ago

Hi @5tinzi, your implementation is actually correct and functionally equivalent to the one in this repo. The one in this repo is complicated by the fact that you don't want to materialize large dense matrices: you do gather/scatter operations or sparse matrix multiplications instead, and you want to support edge weights if present.
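
To illustrate the equivalence: the neighborhood aggregation happens between the two x .* c' scalings, so the net effect is exactly the symmetric normalization. A minimal sketch, assuming Graphs.jl, a features-first (channels × num_nodes) layout as used by this library, and a dense matrix product standing in for the library's gather/scatter aggregation:

using Graphs, LinearAlgebra

g = erdos_renyi(10, 0.3)
A_hat = adjacency_matrix(g) + I            # Â with self-connections
c = 1 ./ sqrt.(vec(sum(A_hat, dims=2)))    # c[j] = d̂_j^(-1/2)
x = randn(4, 10)                           # (channels, num_nodes) node features

# pre-processed form: apply D̂^(-1/2) Â D̂^(-1/2) along the node axis
y1 = x * (Diagonal(c) * A_hat * Diagonal(c))

# scale-aggregate-scale form: x .* c', aggregate over neighbors, x .* c' again
y2 = ((x .* c') * A_hat) .* c'

y1 ≈ y2   # true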