Questions about data sets?

PetarV- / DGI

Deep Graph Infomax (https://arxiv.org/abs/1809.10341)

MIT License

Questions about data sets? #1

Closed linzhi123 closed 5 years ago

linzhi123 commented 5 years ago

Hello, how did you create the datasets in the data folder?

PetarV- commented 5 years ago

Hello,

I'm not sure I understand your question. We use standard benchmark datasets for all experiments reported in our paper. For example, Cora, Citeseer and Pubmed can all be found in Thomas Kipf's GCN repository: https://github.com/tkipf/gcn/tree/master/gcn/data

Thanks, Petar

linzhi123 commented 5 years ago

Thanks

linzhi123 commented 5 years ago

I mean how are the files in these datasets made? @PetarV-

PetarV- commented 5 years ago

Perhaps this description can help? https://github.com/kimiyoung/planetoid/blob/master/README.md

I'm sorry I cannot be of much more help than that; I didn't take part in preparing the files.
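
For readers wondering what making such files might involve, here is a minimal sketch, assuming the layout described in the Planetoid README linked above: `ind.<name>.graph` is a pickled dict of the form `{node_index: [neighbor_indices]}`, and `ind.<name>.test.index` is a plain-text file with one test node index per line. The function names and the dataset name `toy` are hypothetical, and this is not the code that produced the published files. The feature and label files (`x`, `y`, `tx`, `ty`, `allx`, `ally`) are pickled sparse matrices and arrays and are not covered here.

```python
import pickle
from collections import defaultdict
from pathlib import Path


def edges_to_graph(edges):
    """Turn undirected (u, v) pairs into {node: [neighbors]} (assumed layout)."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)
    return graph


def write_graph_files(out_dir, name, edges, test_indices):
    """Write ind.<name>.graph (pickle) and ind.<name>.test.index (text)."""
    out_dir = Path(out_dir)
    with open(out_dir / f"ind.{name}.graph", "wb") as f:
        pickle.dump(edges_to_graph(edges), f)
    with open(out_dir / f"ind.{name}.test.index", "w") as f:
        f.writelines(f"{i}\n" for i in test_indices)


def read_graph_files(in_dir, name):
    """Read the two files back; latin1 helps with pickles written by Python 2."""
    in_dir = Path(in_dir)
    with open(in_dir / f"ind.{name}.graph", "rb") as f:
        graph = pickle.load(f, encoding="latin1")
    with open(in_dir / f"ind.{name}.test.index") as f:
        test_indices = [int(line.strip()) for line in f]
    return graph, test_indices
```

Calling `write_graph_files(some_dir, "toy", [(0, 1), (1, 2)], [2])` and then `read_graph_files(some_dir, "toy")` should round-trip the adjacency dict and the test index list, which is a quick way to check that a hand-made dataset matches the assumed layout before passing it to a loader.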

linzhi123 commented 5 years ago

Ok, thank you for your answer.

svjan5 commented 5 years ago

Hi Petar, can you provide the Reddit and PPI datasets in the format used in the code?

Thanks

PetarV- commented 5 years ago

Hello,

For PPI, the preprocessing code found in:

https://github.com/PetarV-/GAT/blob/master/utils/process_ppi.py

should be enough to get you started.

For Reddit, we were unable to get the PyTorch version of GraphSAGE to cooperate, and thus we used the TensorFlow version:

https://github.com/williamleif/GraphSAGE

as a starting point, and modified it to support DGI and load Reddit. Currently there are no plans to release this modified codebase.

Thanks, Petar

svjan5 commented 5 years ago

Thanks a lot!