vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0
596 stars 78 forks source link

Non-Deterministic selection of Community #77

Closed CxVercility closed 2 years ago

CxVercility commented 3 years ago

In the paper "From Louvain to Leiden: guaranteeing well-connected communities" it is mentioned that during the refinement phase the target community should/could(?) be chosen with a certain degree of randomness. However, it seems to me that this implementation (and igraph's implementation) do not do this, but choose to just take the best (aka highest modularity difference). I was wondering if there was a specific reason why this approach was ditched, as I am currently working on my own (parallel) implementation of the algorithm.

I'm asking specifically because it isn't clear to me why the mentioned approach should even work in the first place, and i've not gotten it to work properly in my code either (or it just made no difference). Since the theta parameter is a constant, the resulting exponent would continuously grow smaller as the graph size increases (or rather, the individual modularity deltas decrease). Doesn't this effectively turn it into a more or less true-random selection for big graphs? E.g consider a theta of 1/100 as mentioned (so a multiplier of 100) and modularity differences <= 1/100000. This will result in exponents of 1/100 at most, which makes any weight 1.01 at most.

Thanks in advance.