williamleif / GraphSAGE

Representation learning on large graphs using stochastic graph convolutions.

how to change the dataset #200

Open madewijayya opened 1 year ago

madewijayya commented 1 year ago

If I want to change the dataset, how do I do it?

BulBul-559 commented 1 year ago

You need to organize your data like the example data. Specifically, if you use the unsupervised model, you should prepare at least three files, each named with your dataset prefix as in the example data:

  1. G.json
  2. id_map.json
  3. feats.npy

In my experience, you can build the graph with NetworkX in Python and save it as a JSON file to get G.json. id_map.json maps each node id to a consecutive integer, for example 3->0, 8->1, 10->2: the left side is your node's id and the right side is the consecutive id, so you can generate it from the node ids in G.json. Finally, feats.npy stores the node features.
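As a rough illustration of the id_map step, here is a minimal sketch using only the standard library. The helper names (`build_id_map`, `order_feature_rows`) and the toy data are hypothetical and not part of GraphSAGE; the sketch also assumes that row i of feats.npy should hold the features of the node mapped to i, which is worth checking against the example data.

```python
import json

def build_id_map(node_ids):
    """Map each node id to a consecutive integer, in the order given."""
    return {str(node_id): index for index, node_id in enumerate(node_ids)}

def order_feature_rows(features_by_node, id_map):
    """Arrange feature rows so that row i belongs to the node mapped to i."""
    rows = [None] * len(id_map)
    for node_id, index in id_map.items():
        rows[index] = features_by_node[node_id]
    return rows

# Hypothetical toy data: three nodes with non-consecutive ids.
features_by_node = {"3": [0.1, 0.2], "8": [0.3, 0.4], "10": [0.5, 0.6]}

id_map = build_id_map(["3", "8", "10"])
feature_rows = order_feature_rows(features_by_node, id_map)

# JSON text that would go into the id_map file.
id_map_json = json.dumps(id_map)
```

Writing `id_map_json` to your id_map.json file and saving `feature_rows` with `numpy.save` would then give two of the three files.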

Although the unsupervised model doesn't need class_map.json, the code raises an error if you run it without that file. My workaround was to modify utils.py so that class_map is always None in the load_data function.
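A sketch of that kind of change, written as a standalone helper rather than the repository's actual code: `load_optional_json` is a hypothetical name, and this variant returns None only when the file is missing instead of always. Any later lines in load_data that read from class_map would presumably need a similar guard.

```python
import json
import os

def load_optional_json(path):
    """Return the parsed JSON file, or None when the file does not exist."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)

# "your_input_prefix" is a placeholder for your own dataset prefix.
class_map = load_optional_json("your_input_prefix-class_map.json")
```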

The last step before you run the unsupervised code is to generate the walks.txt file. You can do this by running: python ./graphsage/utils.py your_input_prefix-G.json your_output_prefix-walks.txt
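To give an idea of what this step produces, here is a simplified, standard-library sketch of random-walk co-occurrence pairs. It is an illustration only, not the code in utils.py: the function name, the walk parameters, the toy graph, and the tab-separated output format are all assumptions, so compare against the real output before relying on it.

```python
import random

def random_walk_pairs(adjacency, num_walks=2, walk_len=3, seed=0):
    """Collect (start, visited) pairs from short random walks."""
    rng = random.Random(seed)
    pairs = []
    for start in adjacency:
        if not adjacency[start]:
            continue  # isolated nodes cannot start a walk
        for _ in range(num_walks):
            current = start
            for _ in range(walk_len):
                current = rng.choice(adjacency[current])
                if current != start:  # skip self co-occurrences
                    pairs.append((start, current))
    return pairs

# Hypothetical undirected toy graph as an adjacency list.
adjacency = {"3": ["8"], "8": ["3", "10"], "10": ["8"]}

pairs = random_walk_pairs(adjacency)
# One pair per line, assumed here to be tab-separated.
walks_text = "\n".join(f"{a}\t{b}" for a, b in pairs)
```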

That's all I know. I hope it helps.