Hoosier-Clusters / clusim

An extended package for clustering similarity
MIT License

Speed up in constructing network #38

Closed jisungyoon closed 4 years ago

jisungyoon commented 4 years ago

Hi! I am Jisung Yoon from POSTECH. Element-centric sim is awesome, and it works very well on overlapping community detection!

In this PR, I changed the code that constructs the network (the projection of the bipartite network).

The previous code used the full (dense) matrix both to compute the dot product and to construct the network. That works well on small samples, but it takes an extremely long time on large ones.

I added a test notebook alongside the test code. On the large sample there, I could not even obtain the network because of memory errors (it consumed more than 300 GB).

In a previous trial, computing the network took almost 1 hour and 30 minutes using all 64 cores. With the new implementation, it takes only 2 minutes with a small number of cores.
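For readers who want to see why skipping the full matrix helps, here is a minimal, dependency-free sketch. It is not the code from this PR. The function names `project_sparse` and `project_dense` and the dictionary input format are hypothetical and chosen only for illustration, and it assumes binary memberships where every element belongs to at least one cluster. The sparse version only visits pairs of elements that share a cluster, while the dense reference fills every cell, which is what becomes expensive on large samples.

```python
from collections import defaultdict


def project_sparse(memberships):
    """Project element -> clusters memberships onto element pairs.

    Returns {(i, j): weight} with only the nonzero entries of
    row_normalized(B) . column_normalized(B).T, where B is the
    binary element-by-cluster matrix implied by `memberships`.
    """
    # Invert the map: cluster -> list of member elements.
    members = defaultdict(list)
    for element, clusters in memberships.items():
        for cluster in set(clusters):
            members[cluster].append(element)

    weights = defaultdict(float)
    for i, clusters in memberships.items():
        clusters = set(clusters)
        for cluster in clusters:
            # 1 / (clusters containing i) is the row normalization;
            # 1 / (elements in the cluster) is the column normalization.
            share = 1.0 / (len(clusters) * len(members[cluster]))
            for j in members[cluster]:
                weights[(i, j)] += share
    return dict(weights)


def project_dense(memberships):
    """Reference version that computes every (i, j) cell, zeros included."""
    elements = list(memberships)
    clusters = sorted({c for cs in memberships.values() for c in cs})
    b = [[1.0 if c in memberships[e] else 0.0 for c in clusters]
         for e in elements]
    row_sums = [sum(row) for row in b]
    col_sums = [sum(row[k] for row in b) for k in range(len(clusters))]
    return {
        (ei, ej): sum((b[i][k] / row_sums[i]) * (b[j][k] / col_sums[k])
                      for k in range(len(clusters)))
        for i, ei in enumerate(elements)
        for j, ej in enumerate(elements)
    }


# Toy overlapping clustering: "b" belongs to two clusters.
memberships = {"a": ["c1"], "b": ["c1", "c2"], "c": ["c2"], "d": ["c3"]}
sparse = project_sparse(memberships)
dense = project_dense(memberships)
```

On this toy input the sparse result stores 8 weighted pairs while the dense one stores all 16 cells, and the nonzero weights agree. The gap grows quickly when most elements share no cluster.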

Please take a look and let me know what you think :) @ajgates42 @yy

Thanks very much!

ajgates42 commented 4 years ago

Hi @jisungyoon, This looks great. Thanks for the improvement!

jisungyoon commented 4 years ago

@yy Yeah, I just replaced the code, because the new version is always fast and returns exactly the same results. Do we need to keep the legacy code? @ajgates42

yy commented 4 years ago

Ok, I've removed the option. Now the thing is that the notebook will not run. Can we just keep the benchmark results here as a record but not the notebook?

jisungyoon commented 4 years ago

[Three screenshots attached: Screen Shot 2020-05-14 at 1:59:12 PM, 1:59:19 PM, and 1:59:25 PM]

jisungyoon commented 4 years ago

For archiving (legacy code)
proj1 = bipartite_adj / bipartite_adj.sum(axis=1)
proj2 = bipartite_adj / bipartite_adj.sum(axis=0)
projected_adj = proj1.dot(proj2.T)
cielg = igraph.Graph.Weighted_Adjacency(projected_adj.tolist(), mode=igraph.ADJ_DIRECTED, attr="weight", loops=True)

yy commented 4 years ago

@ajgates42 merge?

ajgates42 commented 4 years ago

Nice, thanks @jisungyoon !