vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0

Weighted CPM clustering takes much longer when scale of weights is higher #170

Closed HosseinMA96 closed 4 months ago

HosseinMA96 commented 4 months ago

Hi,

I am using Leiden CPM to cluster a weighted, directed network with ~14 million nodes and ~52 million edges. Initially, the weights were in the range [0, 0.5], and clustering took about 10 minutes. I then decided to rescale the weights to the [1, 10] interval to see the effect.

For some combinations of weights, the code does not finish. I let it run for 7+ hours and it never completed, while for other combinations of weights (as well as the unweighted case) it finishes in 10 minutes.

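        import leidenalg as la
        # iGraph, pandas_df, resolution_parameter and seed are defined earlier in my script
        kwargs = {'resolution_parameter': resolution_parameter}
        part = la.find_partition(iGraph, la.CPMVertexPartition, seed=seed, weights=pandas_df['weight'], **kwargs)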

While it is still running, `top` shows a Python process using over 90% of a CPU and 20 GB of memory.

Thanks.

vtraag commented 4 months ago

Note that scaling the weights affects the resulting partition in much the same way as scaling the resolution parameter. That is, if you scale the weights by some factor $c$, you should also scale the resolution parameter by the same factor $c$ in order to get the same results (at least for CPM). So scaling the weights up (multiplying by $c$) and scaling the resolution down (dividing by $c$) amount to the same thing. In this case, you are multiplying by quite a lot, so you will probably get much coarser clusters if you keep the same resolution parameter. Additionally, you are adding 1 to the weight, and I don't completely understand the reason for that.
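
To make the scaling argument concrete, here is a minimal sketch in plain Python. It does not call leidenalg; it assumes a simplified undirected form of the CPM quality, $Q = \sum_c \left( w_c - \gamma \binom{n_c}{2} \right)$, where $w_c$ is the total internal edge weight and $n_c$ the number of nodes in community $c$. The toy graph, the factor `c = 20`, the resolution `gamma = 0.2`, and the helper `cpm_quality` are all made up for illustration. It only covers pure multiplication by $c$, not the additional shift by 1.

```python
def cpm_quality(edges, membership, gamma):
    """Simplified undirected CPM quality (an assumption for illustration):
    sum over communities of (internal edge weight) - gamma * (number of node pairs)."""
    sizes = {}
    for comm in membership.values():
        sizes[comm] = sizes.get(comm, 0) + 1
    internal = {}
    for (u, v), w in edges.items():
        if membership[u] == membership[v]:
            comm = membership[u]
            internal[comm] = internal.get(comm, 0.0) + w
    return sum(internal.get(comm, 0.0) - gamma * n * (n - 1) / 2
               for comm, n in sizes.items())

# Hypothetical toy graph: two triangles joined by one weak edge.
edges = {(0, 1): 0.5, (1, 2): 0.4, (0, 2): 0.3,
         (2, 3): 0.1,
         (3, 4): 0.5, (4, 5): 0.4, (3, 5): 0.45}
fine = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}   # one community per triangle
coarse = {node: 0 for node in range(6)}        # everything in one community

gamma = 0.2
c = 20.0
scaled_edges = {e: c * w for e, w in edges.items()}

# Original weights: the two-triangle partition scores higher.
q_fine = cpm_quality(edges, fine, gamma)
q_coarse = cpm_quality(edges, coarse, gamma)

# Weights scaled by c, same gamma: the single big community now scores higher.
q_fine_scaled = cpm_quality(scaled_edges, fine, gamma)
q_coarse_scaled = cpm_quality(scaled_edges, coarse, gamma)

# Weights and gamma both scaled by c: every quality is just multiplied by c,
# so the ranking of partitions is the same as with the original weights.
q_fine_both = cpm_quality(scaled_edges, fine, c * gamma)
q_coarse_both = cpm_quality(scaled_edges, coarse, c * gamma)

print(q_fine > q_coarse, q_fine_scaled > q_coarse_scaled, q_fine_both > q_coarse_both)
```

In the actual call this would correspond to passing `c * resolution_parameter` as `resolution_parameter` whenever the weights are multiplied by `c`.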

In short: the resulting partition will be different, and it might be much more difficult for the algorithm to converge to a sensible partition. Hope this makes sense!