vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.
GNU General Public License v3.0
616 stars 78 forks source link

Multiplex and temporal network #14

Closed Daimakaimura closed 4 years ago

Daimakaimura commented 5 years ago

Hi Vincent,

I have been playing around for a bit with the package and I have a question. How to manage when you have both a temporal network and a multiplex network. By this I mean I have a number of temporal slices and each slice consists in different aspects of the network. Going back to the example in the documentation, how will you proceed if you have email and telephone graphs of users at different moments in time. I have been going through the documentation but I am not able to find anything close to this case. Is there a way for LeidenAlg to handle this situation?

Thanks for the awesome code!

vtraag commented 5 years ago

You should be able to use the function time_slices_to_layers repeatedly on the various multiplex networks, and then make one large list of the partitions.

For example, if we have three time slices of telephone calls, G_tel_1, G_tel_2 and G_tel_3, and three time slices of email, G_mail_1, G_mail_2 and G_mail_3, you can make the necessary partitions using

layers_tel, interslice_layer, G_full_tel = la.time_slices_to_layers([G_tel_1, G_tel_2, G_tel_3], interslice_weight=0.1);
layers_mail, interslice_layer, G_full_mail = la.time_slices_to_layers([G_mail_1, G_mail_2, G_mail_3], interslice_weight=0.1);

and then follow the remainder of the last example in the documentation on temporal community detection and create partitions for all layers (i.e. both layers_tel and layers_mail). The interslice layer is created twice, and you can simply disregard one, as they provide identical information.

The only thing is that you have to make sure that all nodes of G_tel_1 are identical to the nodes in G_mail_1 (in terms of that node 0 in G_tel_1 refers to the same node as node 0 in G_mail_1), and also for G_tel_2 and G_mail_2, et cetera.

If you run into issues, let me know!

Daimakaimura commented 5 years ago

Thank you for the explanation. I have implemented and runs with no errors. However I am still rather lost as I am not sure how to interpret the 12 partitions I get at the end (from 2 multiplex and 6 time steps/slices). I would expect to end up with 6 membership vectors (1 per time step/slice) so I know to which community each node belongs at every time point. In my first approach I was using find_partition_multiplex() with two graphs (emails and phones) to get the memberships at each point in time. This worked well, however my problem is that the community IDs do not match, ie community 3 in time step 0 != community 3 in time step 1. That was the reason I wanted to use find_partition_temporal() but obviously I am missing something.

Daimakaimura commented 5 years ago

Just checked the content of the 12 partitions I get in the end and it is the same memberhsip.

vtraag commented 5 years ago

I think there is some confusion around how the multiplex community detection works.

The graphs that are passed to find_partition_multiplex should all have an identical vertex set. Each node in each graph that is passed to find_partition_multiplex is assigned to the same cluster. In other words, the clustering is identical across the different layers.

Similarly, all partitions that are passed to optimise_partition_multiplex are assumed to have an identical vertex set. The membership of all partitions that are passed are again identical.

In order to deal with various time slices in which the "same node" in a different time slice may be assigned to different clusters, we create one large network, which embeds the first time slice and the second time slice, and also the interlisce links between the two slices. This is explained in the section slices to layers in the documentation. The membership vector of any partition will hence contain the membership of all nodes in all time slices. See also the documentation for slices_to_layers for more details.