Closed by linzhi123 5 years ago
Hello,
I'm not sure I understand your question. We use standard benchmark datasets for all experiments reported in our paper. For example, Cora, Citeseer and Pubmed can all be found in Thomas Kipf's GCN repository: https://github.com/tkipf/gcn/tree/master/gcn/data
Thanks, Petar
Thanks
I mean, how were the files in these datasets created? @PetarV-
Perhaps this description can help? https://github.com/kimiyoung/planetoid/blob/master/README.md
I'm sorry I cannot be of much more help than that---I didn't take part in preparing the files.
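For anyone who wants something concrete, here is a rough sketch of how files in the `ind.<name>.*` layout described in that README could be written for a custom graph. This is not the original preprocessing code, which is not available in this thread. The function name `write_planetoid_files`, its parameters, and the node ordering (labeled training nodes first, then unlabeled training nodes, then test nodes) are assumptions made for illustration.

```python
import os
import pickle
from collections import defaultdict

import numpy as np
import scipy.sparse as sp


def write_planetoid_files(out_dir, name, features, labels, edges, n_labeled, n_test):
    """Write a graph as ind.<name>.{x,y,tx,ty,allx,ally,graph,test.index}.

    Hypothetical helper. Assumes nodes are ordered as: labeled training
    nodes, unlabeled training nodes, then the last n_test nodes for testing.
    """
    n_nodes = features.shape[0]
    n_train = n_nodes - n_test

    # One-hot encode integer class labels.
    one_hot = np.eye(labels.max() + 1, dtype=np.int32)[labels]

    # Adjacency as {node index: [neighbor indices]}, stored in both directions.
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)

    parts = {
        "x": sp.csr_matrix(features[:n_labeled]),   # labeled training features
        "y": one_hot[:n_labeled],                   # labeled training labels
        "allx": sp.csr_matrix(features[:n_train]),  # all training features
        "ally": one_hot[:n_train],                  # all training labels
        "tx": sp.csr_matrix(features[n_train:]),    # test features
        "ty": one_hot[n_train:],                    # test labels
        "graph": graph,
    }

    os.makedirs(out_dir, exist_ok=True)
    for suffix, obj in parts.items():
        with open(os.path.join(out_dir, f"ind.{name}.{suffix}"), "wb") as f:
            pickle.dump(obj, f)

    # Test node indices go in a plain text file, one index per line.
    with open(os.path.join(out_dir, f"ind.{name}.test.index"), "w") as f:
        for idx in range(n_train, n_nodes):
            f.write(f"{idx}\n")
```

Calling it with a dense feature array, an integer label array, and a list of `(u, v)` edge pairs should produce eight files in `out_dir`. Whether a given loader accepts them depends on that loader's own assumptions (for example, about index ordering), so treat this as a starting point only.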
Ok, thank you for your answer.
Hi Petar, can you provide the Reddit and PPI datasets in the format used in the code?
Thanks
Hello,
For PPI, the preprocessing code found in:
https://github.com/PetarV-/GAT/blob/master/utils/process_ppi.py
should be enough to get you started.
For Reddit, we were unable to get the PyTorch version of GraphSAGE to cooperate, and thus we used the TensorFlow version:
https://github.com/williamleif/GraphSAGE
as a starting point, and modified it to support DGI and load Reddit. Currently there are no plans to release this modified codebase.
Thanks, Petar
Thanks a lot!
Hello, how did you create the dataset files under the data folder?