muhanzhang / SEAL

SEAL (learning from Subgraphs, Embeddings and Attributes for Link prediction). M. Zhang and Y. Chen, "Link Prediction Based on Graph Neural Networks," NeurIPS 2018 spotlight.

Training on really large graphs #67

Open · dabidou025 opened this issue 2 years ago

dabidou025 commented 2 years ago

Hello, for a school project I am currently using your implementation for link prediction between scientific papers on a really large graph, with more than 100k nodes and 1 million edges.

I am currently using a Kaggle notebook with the free GPU, and even the 'Enclosing subgraph extraction' part takes a really long time to run (still at 0% 0/8 after 30 minutes of runtime).
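For context, the extraction step pulls a k-hop enclosing subgraph around every candidate link, so its cost grows roughly with the number of links times the neighborhood size. Below is a rough, standalone sketch of what a single extraction looks like, using PyTorch Geometric's `k_hop_subgraph` purely as an illustration, not this repository's exact code:

```python
import torch
from torch_geometric.utils import k_hop_subgraph

# Toy undirected graph; edge_index stores both directions of every edge.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])

def extract_enclosing_subgraph(u, v, edge_index, num_hops=1):
    # Union of the num_hops-neighborhoods of the two endpoints of the candidate link.
    nodes, sub_edge_index, mapping, _ = k_hop_subgraph(
        [u, v], num_hops, edge_index, relabel_nodes=True)
    return nodes, sub_edge_index, mapping  # mapping = where u, v ended up in `nodes`

nodes, sub_ei, mapping = extract_enclosing_subgraph(0, 3, edge_index, num_hops=1)
print(nodes.tolist(), sub_ei.tolist(), mapping.tolist())
# This runs once per train/val/test link, so the total cost is roughly
# (#links) x (average num_hops-neighborhood size); with ~1M edges, using 1 hop
# and/or subsampling the candidate links is the main way to keep it tractable.
```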

Could you suggest parameters, tips, and tricks to make training feasible for a student who doesn't have huge GPUs and only has limited access to one?

Of course I don't expect a miracle solution, but I hope you can put me on the right track =).

Thank you for your work!

muhanzhang commented 2 years ago

Hi! I suggest using the latest implementation of SEAL, which is based on PyTorch Geometric. It is much more efficient on large graphs and supports custom datasets, too. That repository also implements some methods to reduce training/inference time, which are discussed in its issues.
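To make those time-saving levers concrete, here is a minimal sketch of the two that usually matter most on large graphs: a small hop count and on-the-fly ("dynamic") extraction combined with link subsampling. The class and argument names below are illustrative, not the repository's actual API:

```python
import torch
from torch.utils.data import Dataset
from torch_geometric.utils import k_hop_subgraph

class DynamicSEALDataset(Dataset):
    """Extracts the enclosing subgraph of each candidate link on the fly,
    instead of precomputing and storing all subgraphs up front."""

    def __init__(self, edge_index, links, num_hops=1, sample_ratio=0.1):
        # Subsample candidate links to cut both extraction and training time.
        perm = torch.randperm(links.size(0))[: int(sample_ratio * links.size(0))]
        self.links = links[perm]              # shape [num_sampled_links, 2]
        self.edge_index = edge_index
        self.num_hops = num_hops

    def __len__(self):
        return self.links.size(0)

    def __getitem__(self, idx):
        u, v = self.links[idx].tolist()
        nodes, sub_ei, mapping, _ = k_hop_subgraph(
            [u, v], self.num_hops, self.edge_index, relabel_nodes=True)
        # Node labeling (e.g. DRNL) and batching into PyG Data objects omitted.
        return nodes, sub_ei, mapping
```

Extracting lazily inside `__getitem__` avoids the up-front preprocessing pass entirely, at the price of recomputing subgraphs each epoch.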

ccMrCaesar commented 2 years ago

@dabidou025 Hello! I wonder, did you succeed in running SEAL on a really large graph from a given edge list? I am currently facing similar problems for a school project, but I'm not familiar with PyTorch Geometric. Thanks!
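For anyone starting from a plain edge list, here is a minimal sketch of loading it into the `torch_geometric.data.Data` object that PyG-based code typically consumes; the file name and the "one `src dst` pair per line" format are assumptions:

```python
import torch
from torch_geometric.data import Data
from torch_geometric.utils import to_undirected

edges = []
with open('edges.txt') as f:                 # hypothetical file name
    for line in f:
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))

edge_index = torch.tensor(edges, dtype=torch.long).t().contiguous()
edge_index = to_undirected(edge_index)       # store both edge directions
num_nodes = int(edge_index.max()) + 1        # assumes nodes are 0..N-1 integer ids
data = Data(edge_index=edge_index, num_nodes=num_nodes)
print(data)
```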