FilippoMB / Variational-Graph-Auto-encoders-Tensorflow-2-Spektral-

MIT License

Extra information about what is happening after the last GCN layer #1

Open StefanBloemheuvel opened 2 years ago

StefanBloemheuvel commented 2 years ago

Hi,

This looks really interesting! I was reading the paper by Kipf, but there was only a TensorFlow v1 implementation, so this is perfect for experimentation and understanding.

I do have one question: what exactly is happening in the following lines of code?

out = tf.matmul(z, tf.transpose(z))
A_rec = tf.keras.layers.Activation('sigmoid')(out)
out = tf.reshape(out, [-1])

It seems that you first multiply the encoded embedding 'z' by its transpose. Would that collect information from neighbouring nodes? (Judging by this Stack Overflow example.)

Also, my interest is in using these techniques for further analysis of the embedded features, so I think I need to do something different from:

A_rec = tf.keras.layers.Activation('sigmoid')(out)
out = tf.reshape(out, [-1])

if link prediction is not my goal? For example, clustering or classification of the graph or its nodes.

Kind regards and thanks for this work!

Stefan

FilippoMB commented 2 years ago

Hi Stefan,

Question 1:
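The product z z^T is the inner-product decoder of the (V)GAE: entry (i, j) is the dot product between the embeddings of nodes i and j, and the sigmoid turns that score into the probability of an edge between them. So those lines reconstruct A from the embeddings; they do not aggregate information from neighbours (that happens inside the GCN layers). A toy numpy sketch with made-up embedding values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# made-up 2D embeddings for 3 nodes
z = np.array([[ 1.0, 0.0],
              [ 0.9, 0.1],
              [-1.0, 0.2]])

logits = z @ z.T         # entry (i, j) = dot(z_i, z_j)
A_rec = sigmoid(logits)  # symmetric matrix of edge probabilities in (0, 1)

# nodes 0 and 1 have similar embeddings -> high edge probability,
# nodes 0 and 2 point in opposite directions -> low edge probability
```

The tf.reshape to a flat vector is just there so the reconstruction can be compared entry-wise against the flattened target adjacency in the loss.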

Question 2: Let's say you want to cluster the nodes. You can do something like this after the GAE model has been trained:

from sklearn.cluster import KMeans

# retrieve the node embeddings from the trained model
_, _, node_emb = model([X, A])
node_emb = node_emb.numpy()

# cluster them with k-means; set n_clusters to the number of groups you expect
clust_algo = KMeans(n_clusters=7)
clust_algo.fit(node_emb)
cluster_labels = clust_algo.labels_
StefanBloemheuvel commented 2 years ago

Hi @FilippoMB,

Thanks for the very quick reply! I will dive into everything you mentioned; this is really cool. I have not seen such a usage of the sigmoid function before.

However, this does get me thinking: if I want to work on the embeddings of the node features, perhaps reconstructing the adjacency matrix is not the best way to go (it also does not work if the edges are weighted rather than binary, right?). Maybe I should rewrite this model to reconstruct the features themselves. My initial guess is that this would improve the quality of the embedding for such a task. I will try to figure out a way to achieve this.
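As a first sketch of that idea (pure numpy, with a closed-form linear least-squares decoder standing in for a trainable Dense layer; all shapes and values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, emb_dim, feat_dim = 10, 4, 6

Z = rng.normal(size=(n_nodes, emb_dim))   # node embeddings from the encoder
X = rng.normal(size=(n_nodes, feat_dim))  # original node features

# linear decoder X_rec = Z @ W, fitted in closed form here;
# in the actual model it would be a trainable Dense layer
W, *_ = np.linalg.lstsq(Z, X, rcond=None)
X_rec = Z @ W

# this MSE would replace the adjacency reconstruction loss
mse = np.mean((X - X_rec) ** 2)
```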

One last question: what is your result on the test set? On my machine I get around 58-62%, depending on the number of epochs, of course.

FilippoMB commented 2 years ago

Hi Stefan,

sorry for the late reply.

There are many ways to obtain node embeddings in an unsupervised manner. Doing link prediction with the VGAE allows you to generate node embeddings that:

I would say that if your edges are weighted rather than binary, the VGAE approach can still work. In particular, if the edge weights are normalized to [0, 1], you can still use the sigmoid output as the reconstruction.
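A small sketch of that idea (made-up weighted adjacency with weights already scaled to [0, 1]; binary cross-entropy also accepts such soft targets):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# hypothetical weighted adjacency, weights already in [0, 1]
A_w = np.array([[0.0, 0.8, 0.1],
                [0.8, 0.0, 0.5],
                [0.1, 0.5, 0.0]])

z = np.array([[1.0, 0.2],
              [0.9, 0.3],
              [0.1, 1.0]])  # toy embeddings

A_rec = sigmoid(z @ z.T)    # reconstruction in (0, 1)

# cross-entropy against the soft (weighted) targets
eps = 1e-7
bce = -np.mean(A_w * np.log(A_rec + eps)
               + (1 - A_w) * np.log(1 - A_rec + eps))
```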

I just ran the code on my machine, and for the GAE I obtained

AUC: 99.0, AP: 98.3
Test accuracy: 59.9

while with the VGAE I obtained

AUC: 98.9, AP: 97.7
Test accuracy: 57.0

Of course, the results change slightly from run to run.