thunlp / OpenNE

An Open-Source Package for Network Embedding (NE)
MIT License
1.68k stars, 485 forks

How to process the wiki dataset so that it can run on GCN? #118

Closed: jzm-123 closed this issue 2 years ago

Bznkxs commented 2 years ago

Hi @jzm-123 !

Raw GCN only accepts attributed graphs (graphs with node features). Since the wiki dataset does not contain node features, a straightforward method is to create a node feature file and load it as a self-defined dataset.

The feature file should have one line per node: the first number is the node number, and the numbers that follow are that node's feature vector (which you can leave empty in this case), like:

0
1
2
3
...
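
If it helps, here is a minimal sketch of generating such a file from an edge list. It assumes the wiki edges are stored as whitespace-separated "src dst" pairs, one per line; the function names and any paths you pass in are hypothetical and not part of OpenNE.

```python
def feature_lines_from_edges(edge_lines):
    """Return one feature-file line (node number only) per node in an edge list.

    Assumes each edge line looks like "src dst"; extra columns are ignored.
    """
    nodes = set()
    for line in edge_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        nodes.update(parts[:2])
    try:
        # Sort numerically when every node number is an integer
        return sorted(nodes, key=int)
    except ValueError:
        # Fall back to plain string ordering for non-numeric ids
        return sorted(nodes)


def write_feature_file(edge_path, feature_path):
    """Read an edge list and write a feature file with empty feature vectors."""
    with open(edge_path) as f:
        lines = feature_lines_from_edges(f)
    with open(feature_path, "w") as out:
        out.write("\n".join(lines) + "\n")
```

Calling write_feature_file with your edge list path and an output path would produce a file in the format above, with the feature part of every line left empty.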

However, I advise that you modify class Model in models/gcn/gcnModel slightly so that each node gets a trainable embedding (an nn.Embedding with one row per node) before you train with GCN. Specifically, create a self.embeddings member, and in forward, add self.input = self.embeddings(inputs).
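
To illustrate the idea, here is a rough standalone PyTorch sketch. It is not the actual Model class from OpenNE: the class name, layer sizes, and the dense adjacency argument are assumptions made for the example. Only the self.embeddings member and the lookup in forward mirror the suggestion above.

```python
import torch
import torch.nn as nn


class EmbeddingGCNSketch(nn.Module):
    """Toy two-layer GCN whose input features are learned per-node embeddings."""

    def __init__(self, num_nodes, embed_dim, hidden_dim, num_classes):
        super().__init__()
        # One trainable vector per node, replacing the missing node features
        self.embeddings = nn.Embedding(num_nodes, embed_dim)
        self.lin1 = nn.Linear(embed_dim, hidden_dim)
        self.lin2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, inputs, adj):
        # inputs: LongTensor of node indices, shape (N,)
        # adj: (assumed) normalized dense adjacency matrix, shape (N, N)
        self.input = self.embeddings(inputs)
        hidden = torch.relu(adj @ self.lin1(self.input))
        return adj @ self.lin2(hidden)
```

Here inputs would be torch.arange(num_nodes) rather than a feature matrix, and the embedding weights are updated by the optimizer together with the other GCN parameters.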