vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0
596 stars 78 forks source link

unordered version of temporal community detection #67

Closed mb3152 closed 2 years ago

mb3152 commented 3 years ago

Would it be easy / possible to build a wrapper that links nodes across all layers? This would be good for say, a bunch of different brain networks from different subjects. So, no temporal order; you want a node to be linked to itself across all layers, not just the graph before and after it in the array of graphs. They support this in the GenLouvain MATLAB library.

vtraag commented 3 years ago

You can do this using the function slices_to_layers. This is just a generic function where a coupling graph allows you to specify how nodes across different slices should be connected. Each slice is a node of the coupling graph, and the connections between the nodes of the coupling graph indicate how the slices should be connected.

For example, for connecting three slices all between each other, you can use

H_coupling = ig.Graph.Full(3)
H_coupling.vs['slice'] = [H1, H2, H3]
H_coupling.es['weight'] = 1
G_layers, G_interslice, G_full = leidenalg.slices_to_layers(H_coupling)

You can then create the necessary partition and optimise them, this is also explained in the more detailed section on converting slices to layers.

Is this a situation that many people will encounter? If so, I could add a convenience function, similar to find_partition_temporal

mb3152 commented 3 years ago

Ah, I see how that works. Just to ensure I am doing it correctly, H1, H2, and H3 will only have links between layers from a node to itself, not to other nodes?

I assume many people would use it for any unordered multi-slice network. Quite common in human neuroscience. Would greatly appreciated a convenience function!

Thank you for your time.

vtraag commented 3 years ago

H1, H2, and H3 will only have links between layers from a node to itself, not to other nodes?

Yep, that's right, it only connects nodes with similar node identifiers. The attribute that is used to identify "the same node" across multiple slices can be specified using the vertex_id_attr. See the mentioned docs more explanation.

I assume many people would use it for any unordered multi-slice network. Quite common in human neuroscience. Would greatly appreciated a convenience function!

I'll think about an option to make this type of thing simpler! I'll leave issue open to remind me.

mb3152 commented 3 years ago

thank you!!

mb3152 commented 3 years ago

I edited your function to just take in the graphs and layers/slices

def find_partition(graphs,G_layers, G_interslice, G, partition_type,slice_attr='slice', vertex_id_attr='id',edge_type_attr='type', weight_attr='weight',n_iterations=2, max_comm_size=0, seed=None,**kwargs):
    # Optimise partitions
    arg_dict = {}
    if 'node_sizes' in partition_type.__init__.__code__.co_varnames:
        arg_dict['node_sizes'] = 'node_size'

    if 'weights' in partition_type.__init__.__code__.co_varnames:
        arg_dict['weights'] = 'weight'

    arg_dict.update(kwargs)

    partitions = []
    for H in G_layers:
        arg_dict['graph'] = H
        partitions.append(partition_type(**arg_dict))

    # We can always take the same interslice partition, as this should have no
    # cost in the optimisation.
    partition_interslice = la.CPMVertexPartition(G_interslice, resolution_parameter=0,
                                            node_sizes='node_size', weights=weight_attr)
    optimiser = la.Optimiser()

    optimiser.max_comm_size = max_comm_size

    if (not seed is None):
        optimiser.set_rng_seed(seed)

    improvement = optimiser.optimise_partition_multiplex(partitions + [partition_interslice], n_iterations=n_iterations)

    # Transform results back into original form.
    membership = {(v[slice_attr], v[vertex_id_attr]): m for v, m in zip(G.vs, partitions[0].membership)}

    membership_time_slices = []
    for slice_idx, H in enumerate(graphs):
        membership_slice = [membership[(slice_idx, v[vertex_id_attr])] for v in H.vs]
        membership_time_slices.append(list(membership_slice))
    return membership_time_slices
mb3152 commented 3 years ago

Can you explain the usage of CPMVertexPartition with a resolution parameter of zero? Does this partition solution matter to the final partition?

mb3152 commented 3 years ago

Also, I am running this on a 400 node network, density is 1 percent, with 900 layers. Any idea how long this should take? I submitted Friday, still running this weekend. I assume that's a bit longer than expected. Am I doing it wrong?

vtraag commented 3 years ago

Can you explain the usage of CPMVertexPartition with a resolution parameter of zero? Does this partition solution matter to the final partition?

This corresponds to the interslice coupling. There is no cost associated (i.e. no penalty for bringing nodes together in the same community), hence the resolution parameter should be zero. In principle one could also use a RBERVertexPartition or a RBConfigurationVertexPartition, as long as the resolution parameter is zero. This corresponds to the d_ij C_jsr in Eq. (3) in Mucha et al. (2020)

Also, I am running this on a 400 node network, density is 1 percent, with 900 layers. Any idea how long this should take? I submitted Friday, still running this weekend. I assume that's a bit longer than expected. Am I doing it wrong?

Note that the networks grow quite big very fast. With 900 slices of 400 nodes, you are running the algorithm on 900 layers of 900*400 = 360 000 nodes. Nonetheless, the current implementation is less efficient for many layers. There have been some attempts at making this faster for the earlier Louvain implementation, see https://github.com/vtraag/louvain-igraph/pull/34, but this was only implemented for modularity and did not generalize easily over all supported quality functions with substantial effort. Nonetheless, you may be interested in checking this out. You could ask @wweir827 or @ragibson if they perhaps have an updated version of their improvement.