vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0
606 stars 78 forks source link

Endless loop in in optimisers #126

Closed geniusadventurer closed 1 year ago

geniusadventurer commented 1 year ago

Thank you for the package, it's very convenient to use. However, I found that sometimes when using the function optimise_partition_multiplex in Optimiser.py under some parameters, it will lead to an endless loop. Sometimes it's due to the very small difference value in this function (for example, 1e-14, very close to zero) when I set n_iterations<0. So I think you can add a default threshold parameter (e.g. 1e-5) so when the difference is very small, it can stop iterating and output a result. I propose the same suggestion for other functions. But another situation is when the function uses _c_leiden._Optimiser_optimise_partition_multiplex function, it will be endless under some parameters and I don't know what happened in this function. Maybe also an issue like lacking a threshold? Sometimes I need to restart the whole process to get all my results due to this issue. Thank you!

def leiden(resolution, result_type, G_prox, G_od, nodes, resolution_list=None):
    assert result_type in ['prox', 'od', 'multi']

    if result_type == 'prox' or result_type == 'od':
        assert resolution_list is None
        if result_type == 'prox':
            G = G_prox
        else:
            G = G_od
        partition = la.find_partition(G, la.CPMVertexPartition, resolution_parameter=resolution)  # Single leiden
        res_partition = {}
        c_set = set(partition.membership)
        for node in nodes:
            res_partition[node] = partition.membership[nodes.index(node)]
    else:
        assert resolution is None
        optimiser = la.Optimiser()
        part_prox = la.CPMVertexPartition(G_prox, resolution_parameter=resolution_list[0])
        part_od = la.CPMVertexPartition(G_od, resolution_parameter=resolution_list[1])
        # Set n_iterations to 30 to avoid endless loop.
        optimiser.optimise_partition_multiplex([part_prox, part_od], layer_weights=resolution_list[2:], n_iterations=30)
        res_partition = {}
        c_set = set(part_prox.membership)
        for node in nodes:
            res_partition[node] = part_prox.membership[nodes.index(node)]
    return res_partition, len(c_set)
vtraag commented 1 year ago

In principle, it should not be possible for it to be an endless loop. However, it should be possible for it to take a very, very long time. For that reason, I would simply suggest setting some very large (but finite) number of iterations, instead of allowing potentially near infinite number of iterations.

I will not include such a type of threshold approach, but you can easily implement that yourself in Python directly. The optimise_partition_multiplex function returns the difference in the quality function it found. Hence, if you simply do a loop like

while delta > 1e-5:
    delta = optimiser.optimise_partition_multiplex( ... )

you essentially get the effect you are after.

Of course, it might be that there is some bug somewhere that would lead to an infinite loop. But I would then need more information to be able to reproduce the problem.

vtraag commented 1 year ago

I'll close it for now. If you are able to find a reproducible example that leads to an infinite loop, feel free to re-open!