Closed rmwu closed 3 years ago
Thanks for using SubGNN! Below are answers to your questions 1 & 4. Let me look into questions 2 & 3 and get back to you.
Memory for SubGNN scales as a function of the number of subgraphs. Your dataset of 16k subgraphs is actually larger than the datasets we used in our paper. Are you able to run with a smaller batch size?
Good catch, we recently refactored the code to make it easier for others to use, but clearly missed a few bugs. I've updated the code base to use graphsaint_gcn_embeddings.pth for everything.
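For anyone who still has embeddings saved under the older name from an earlier checkout, a small lookup helper can accept either filename. This is only a sketch: the helper name `find_embedding_file` and the idea of searching a single directory are assumptions for illustration, not part of the SubGNN code base. The two filenames are the ones mentioned in this issue.

```python
from pathlib import Path

# Both names come from this issue: the first is what the updated code uses,
# the second is what the older preprocessing code wrote.
CANDIDATE_NAMES = (
    "graphsaint_gcn_embeddings.pth",
    "gcn_graphsaint_embeddings.pth",
)


def find_embedding_file(directory, names=CANDIDATE_NAMES):
    """Return the first existing embedding file in `directory`.

    Hypothetical helper: tries each candidate name in order and raises
    FileNotFoundError if none of them exists.
    """
    directory = Path(directory)
    for name in names:
        path = directory / name
        if path.is_file():
            return path
    raise FileNotFoundError(
        f"None of {list(names)} found in {directory}"
    )
```

The returned path can then be passed to whatever loading call the training script already uses.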
Hello, thanks for sharing! When I ran on the ppi_bp dataset, I also encountered this bug: "KeyError: 17605" when computing degrees "graph_degree_seq = [degree_dict[n-1] for n in nodes]". Would you mind helping me deal with it?
Hi @leaf-ygq and @rmwu, I figured out what was causing the bug. We had uploaded a degree_sequence.txt file in the Dropbox corresponding to an older version of the PPI network. Please redownload the ppi_bp folder from Dropbox and reopen an issue if you have any more trouble.
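To check whether a local copy of the data still has this mismatch before starting a run, something like the following can list the offending node IDs. It is a sketch under assumptions: `degree_dict` is taken to map 0-based node indices to degrees and each subgraph to be a list of 1-based node IDs (inferred from the `n-1` lookup in the quoted line), and `find_missing_nodes` is a hypothetical helper, not a function in the repository.

```python
def find_missing_nodes(degree_dict, subgraphs):
    """Return the sorted node IDs n for which degree_dict has no key n - 1.

    Mirrors the lookup `degree_dict[n-1]` from the error message, but
    collects the missing IDs instead of raising KeyError on the first one.
    """
    missing = set()
    for nodes in subgraphs:
        for n in nodes:
            if (n - 1) not in degree_dict:
                missing.add(n)
    return sorted(missing)


# Toy data (not from ppi_bp): degrees for indices 0..3, so valid IDs are 1..4.
toy_degree_dict = {0: 2, 1: 3, 2: 1, 3: 2}
toy_subgraphs = [[1, 2], [3, 4, 7]]
toy_missing = find_missing_nodes(toy_degree_dict, toy_subgraphs)  # contains 7
```

An empty result means every subgraph node has a degree entry; a non-empty one suggests the degree file and the subgraph file come from different versions of the network.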
@rmwu - I'm not able to reproduce the bug you mention in (3). It's possible that it resulted from a mismatch in environments. Hopefully this issue doesn't appear with the updated conda env, but please let me know if you keep on running into this issue.
For (3), I ran into the same issue. Changing line 99 in SubGNN.py to the following seems to work:
self.to(torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
Update: the above is true for pytorch_lightning v1.0.7. For v0.7.1, the original code works.
Hello! I was running your codebase and came across several bugs, as well as memory issues.
1) I processed a dataset of my own with 21584 vertices and 342685 edges, as well as 16734 subgraph labels. Running the code on this, with a batch size of 128, ran into memory issues: "RuntimeError: CUDA out of memory. Tried to allocate 96.11 GiB (GPU 0; 10.92 GiB total capacity; 5.34 MiB already allocated; 10.40 GiB free; 22.00 MiB reserved in total by PyTorch)"
2) When I run on the ppi_bp dataset, downloaded from Dropbox, with the provided config json, line 45 of gamma.py raises the error "KeyError: 21114" when computing degrees "graph_degree_seq = [degree_dict[n-1] for n in nodes]"
3) line 100 of SubGNN.py raises the error "RuntimeError: Cannot set the device explicitly. Please use module.to(new_device)." This suggests that self.device could not be used as a variable. I'm not sure whether this is due to differing versions of the packages, as I installed them myself, but I changed this line to self.device_num
4) The data preprocessing code saves graphsaint embeddings as "gcn_graphsaint_embeddings.pth" but line 229 of train_config.py loads them as "graphsaint_gcn_embeddings.pth"
Thank you for your help!