muhanzhang / SEAL

SEAL (learning from Subgraphs, Embeddings, and Attributes for Link prediction). "M. Zhang, Y. Chen, Link Prediction Based on Graph Neural Networks, NeurIPS 2018 spotlight".

Bug when running on a large graph #75

Open ccMrCaesar opened 2 years ago

ccMrCaesar commented 2 years ago

Hi Dr. Zhang, I am currently using SEAL for link prediction on a really large graph (more than 100k nodes and 1 million edges) for study purposes. When the first "Enclosing subgraph extraction" pass was almost done (97%), a bug occurred:

" File "/lustre/home/Desktop/SEAL/Python/util_functions.py", line 163, in subgraph_extraction_labeling nodes.remove(ind[1]) KeyError: 17196 "

Along with more information:

"The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/lustre/home/Desktop/my_seal/Python/Main.py", line 160, in train_graphs, test_graphs, max_n_label = links2subgraphs( File "/lustre/home/Desktop/my_seal/Python/util_functions.py", line 132, in links2subgraphs train_graphs = helper(A, train_pos, 1) + helper(A, train_neg, 0) File "/lustre/home/Desktop/my_seal/Python/util_functions.py", line 118, in helper results = results.get() File "/lustre/opt/cascadelake/linux-centos7-x86_64/gcc-4.8.5/miniconda3-4.8.2-5yczksexambgeule63z3smwiwrbokjtu/envs/mytorch/lib/python3.9/multiprocessing/pool.py", line 771, in get raise self._value KeyError: 17196 " Show it seems like something happens in multiprocessing model.

I have tried the command from your README, `python Main.py --train-name PB_train.txt --test-name PB_val.txt --hop 1 --save-model`, and it works just fine. But after switching to my own edge list file, the error appears. I also tried with and without `--max-nodes-per-hop 100`, but the problem remains.

I noticed that you suggest using the PyTorch Geometric implementation to deal with large graphs; I just wonder whether this issue is fixable, or whether you have any clue about it?

Thank you for your work!

xihairanfeng commented 2 years ago

I'm not sure what's going on, but do any of your node pairs have the same node on both the left and the right (i.e., self-loops)? If so, please remove those pairs from your data and try again.
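If it helps, here is a minimal sketch of why a pair with identical endpoints could produce exactly this `KeyError`, plus a small filter for removing such pairs from an edge list. The failure mechanism is inferred from the `nodes.remove(ind[1])` line in the traceback, and the edge-list file names and whitespace-separated format are assumptions:

```python
# Assumed failure mechanism, inferred from `nodes.remove(ind[1])` in the
# traceback: if a target link's two endpoints are the same node, the
# endpoint set collapses to a single element and the second removal fails.
ind = (17196, 17196)
nodes = {ind[0], ind[1]}     # sets deduplicate -> {17196}
nodes.remove(ind[0])         # nodes is now empty
try:
    nodes.remove(ind[1])
except KeyError as err:
    print('KeyError:', err)  # prints "KeyError: 17196", matching the report

# Small filter to drop self-loop pairs from a whitespace-separated edge
# list before running Main.py (file names and format are assumptions):
def drop_self_loops(in_path, out_path):
    with open(in_path) as fin, open(out_path, 'w') as fout:
        for line in fin:
            parts = line.split()
            if len(parts) >= 2 and parts[0] != parts[1]:
                fout.write(line)  # keep only pairs with distinct endpoints

# Usage with hypothetical file names:
# drop_self_loops('my_edges.txt', 'my_edges_noloops.txt')
```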