Closed: chamath-eka closed this issue 3 years ago
Most of the time is spent on WL hashing, and that part only uses a single core; more cores are used during the embedding phase. If you read the graphs from disk, you can use the approach that I have here:
https://github.com/benedekrozemberczki/graph2vec
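To illustrate why the hashing phase is the bottleneck, here is a minimal sketch (not karateclub's internals) of the Weisfeiler-Lehman relabelling that Graph2Vec-style models perform per graph. It is pure-Python, CPU-bound work, so one workaround is to spread it across processes yourself; the `wl_features` helper and the dict-of-neighbours graph format are assumptions for the example:

```python
# Illustration only: per-graph WL relabelling is pure-Python, CPU-bound
# work, so it can be parallelised across graphs with multiprocessing
# even when a library's own hashing loop runs on a single core.
import hashlib
from multiprocessing import Pool

def wl_features(graph, iterations=2):
    """graph: dict node -> list of neighbours. Returns WL 'words' per node."""
    labels = {v: str(len(nbrs)) for v, nbrs in graph.items()}  # init: degrees
    docs = list(labels.values())
    for _ in range(iterations):
        new_labels = {}
        for v, nbrs in graph.items():
            # combine a node's label with its sorted neighbour labels, then hash
            combined = labels[v] + "".join(sorted(labels[u] for u in nbrs))
            new_labels[v] = hashlib.md5(combined.encode()).hexdigest()
        labels = new_labels
        docs.extend(labels.values())
    return docs

if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    path = {0: [1], 1: [0, 2], 2: [1]}
    with Pool(2) as pool:
        # one WL feature document per graph, computed in parallel
        feature_docs = pool.map(wl_features, [triangle, path])
    print(len(feature_docs))  # → 2
```

The resulting per-graph "documents" are what the embedding phase (where the `workers` parameter does help) consumes.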
Could you star the repos and hit follow on Github?
On Mon, 16 Aug 2021 at 12:29, Chamath Ekanayake wrote:
I tried to increase the performance of the Graph2Vec model by increasing the `workers` parameter when initializing the model, but it seems the model still uses only one core for the `fit` function.
Is the method I used to assign the workers correct? Is there another way to improve the performance?

```python
model = Graph2Vec(workers=28)
graphs_list = create_graph_list(graph_df)
model.fit(graphs_list)
graph_x = model.get_embedding()
```
@benedekrozemberczki Is it possible to optimize the memory usage when training a graph embedding model, for example with batch processing?
If there is an option or workaround to do that, please let me know; I find it hard to scale to large datasets.
I tried to break a small dataset into smaller parts and run multiple fits, but the resulting embeddings are quite different.
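One likely reason the per-batch embeddings differ is that each independent fit trains its own randomly initialized vector space, so vectors from separate fits are not directly comparable. A memory-side workaround is to stream graphs from disk so that only the lightweight per-graph feature documents stay in RAM before a single fit. A minimal sketch, assuming a hypothetical directory of JSON edge-list files (the file layout, `iter_graphs`, and `graph_document` are illustrations, not karateclub API):

```python
# Sketch of a streaming workaround: load one graph at a time from disk,
# reduce it to a small feature document, and only keep the documents.
import json
import os

def iter_graphs(directory):
    """Lazily yield graphs stored as JSON files, e.g. {"edges": [[0, 1], ...]}."""
    for name in sorted(os.listdir(directory)):
        if name.endswith(".json"):
            with open(os.path.join(directory, name)) as f:
                yield json.load(f)

def to_adjacency(edges):
    """Build an undirected adjacency dict from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def graph_document(adj):
    # Stand-in for the per-graph feature extraction; in practice this would
    # be the WL feature list a Graph2Vec-style model builds internally.
    return sorted(str(len(nbrs)) for nbrs in adj.values())
```

With this, `[graph_document(to_adjacency(g["edges"])) for g in iter_graphs(path)]` holds only the small documents in memory, and a single embedding fit over them keeps all graphs in one shared vector space.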
I had the same problem here. I also tried to use Graph2Vec to fit a large graph dataset, and it turns out I can't fit a list of all the graphs into RAM. Batch processing would definitely help, but I'm not sure if it is available in the framework now.
May I ask how you solved this problem eventually?
In the end, I went with the FEATHER method, since FEATHER is more scalable than Graph2Vec and it is not transductive.
FEATHER scales well, it is indeed inductive and more expressive. It is SOTA on this dataset too: https://openreview.net/forum?id=1xDTDk3XPW (which is extremely cool considering the dataset size).
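For readers unfamiliar with FEATHER, its core idea can be toy-sketched in pure Python: pool the characteristic function of a node feature over random-walk neighbourhoods to get a fixed-size, inductive graph descriptor. This is a simplified illustration, not the reference implementation (the real model uses richer node features and a normalized adjacency matrix; `feather_like_embedding` and the degree feature are assumptions for the sketch):

```python
# Toy sketch of the FEATHER idea: average e^{i * theta * x_u} over the
# r-step random-walk distribution of each node, then mean-pool over nodes.
import math

def feather_like_embedding(adj, thetas=(0.5, 1.0, 2.0), steps=2):
    n = len(adj)
    x = {v: len(nbrs) for v, nbrs in adj.items()}  # node feature: degree
    emb = []
    for r in range(1, steps + 1):
        # r-step uniform random-walk distribution from each node
        probs = {v: {v: 1.0} for v in adj}
        for _ in range(r):
            nxt = {}
            for v, dist in probs.items():
                acc = {}
                for u, p in dist.items():
                    for w in adj[u]:
                        acc[w] = acc.get(w, 0.0) + p / len(adj[u])
                nxt[v] = acc
            probs = nxt
        for theta in thetas:
            # real and imaginary parts of the pooled characteristic function
            re = sum(sum(p * math.cos(theta * x[u]) for u, p in probs[v].items())
                     for v in adj) / n
            im = sum(sum(p * math.sin(theta * x[u]) for u, p in probs[v].items())
                     for v in adj) / n
            emb.extend([re, im])
    return emb
```

Because the descriptor depends only on each graph's own structure, it can be computed for unseen graphs independently, which is what makes the method inductive rather than transductive.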
@1209973 The newest release has inference functionality.