graspologic-org / graspologic

Python package for graph statistics
https://graspologic-org.github.io/graspologic/
MIT License
764 stars, 143 forks

Parallelization on MASE #751

Open loftusa opened 3 years ago

loftusa commented 3 years ago

In the initial embedding step (ASE on each graph separately, prior to concatenation), MASE just computes all the embeddings in a for-loop. This can be parallelized pretty easily with joblib. I can make a PR, but it might also be fun for someone else to play around with (@rajpratyush?). It should be pretty straightforward: you can just parallelize the loop, right?

this line of code

rajpratyush commented 3 years ago

Sureee

loftusa commented 3 years ago

@rajpratyush if you're up for it, it'd be great to see some figures in the PR comparing runtime with and without parallelization in different situations. Could be fun as well :)
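For the runtime comparison, a small timing helper is probably enough to get numbers for a figure. This is only a sketch using the standard library: `best_of` and `toy_workload` are hypothetical names, not part of graspologic, and the toy workload is just a placeholder for "embed every graph".

```python
import time


def best_of(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` calls of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the mean.
    return min(timings)


def toy_workload():
    # Placeholder only: swap in the serial or the parallel embedding call.
    return sum(i * i for i in range(10_000))


elapsed = best_of(toy_workload, repeats=3)
print(f"best of 3: {elapsed:.6f} s")
```

Calling the helper once on the serial loop and once on the parallel version, over a range of graph counts, graph sizes, and `n_jobs` values, would give the points to plot.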

rajpratyush commented 3 years ago

Sure let me try it

rajpratyush commented 3 years ago

I was just going through the joblib documentation, and I think this should be the proper replacement for the code you mentioned:

    from joblib import Parallel, delayed

    # delayed wraps the function itself; its arguments go in the second call.
    # Parallel already returns a list, in the same order as graphs.
    embeddings = Parallel(n_jobs=2)(
        delayed(selectSVD)(
            graph,
            n_components=n_components,
            algorithm=self.algorithm,
            n_iter=self.n_iter,
        )
        for graph in graphs
    )

loftusa commented 3 years ago

Yeah, that looks right! See this older code in the case branch for another example
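For anyone who wants to try the pattern outside the MASE class, here is a minimal self-contained sketch. It assumes numpy and joblib are installed. `embed_one` is a hypothetical stand-in built on `np.linalg.svd`, not graspologic's `selectSVD`, and the random graphs are made up for illustration.

```python
import numpy as np
from joblib import Parallel, delayed


def embed_one(adjacency, n_components):
    """Stand-in per-graph embedding (assumption: NOT graspologic's selectSVD)."""
    u, s, _ = np.linalg.svd(adjacency, full_matrices=False)
    # Keep the top components, scaled by the square root of the singular values.
    return u[:, :n_components] * np.sqrt(s[:n_components])


def embed_serial(graphs, n_components):
    # The plain for-loop version.
    return [embed_one(g, n_components) for g in graphs]


def embed_parallel(graphs, n_components, n_jobs=2):
    # delayed(f)(args) records the call without running it;
    # Parallel runs the calls and returns the results in input order.
    return Parallel(n_jobs=n_jobs)(
        delayed(embed_one)(g, n_components) for g in graphs
    )


# Four small random undirected graphs, for illustration only.
rng = np.random.default_rng(0)
graphs = []
for _ in range(4):
    upper = np.triu(rng.integers(0, 2, size=(30, 30)), k=1)
    graphs.append((upper + upper.T).astype(float))

serial = embed_serial(graphs, n_components=3)
parallel = embed_parallel(graphs, n_components=3)
print(len(parallel), parallel[0].shape)
```

Each graph is embedded independently, so the loop body has no shared state and the parallel version should return the same embeddings as the serial one, up to the usual sign ambiguity of singular vectors.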

rajpratyush commented 3 years ago

So I will push the PR tomorrow morning.