GiulioRossetti / dynetx

Dynamic Network Analysis library
http://dynetx.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License

Saving dynamic graph in a format for plotting and viewing in software such as Gephi and community detection #78

Open amjass12 opened 4 years ago

amjass12 commented 4 years ago

Hi!

Thank you for this great package; it is really easy to use and works for my temporal data. I have a few questions:

I have 7 graphs, each representing a time point. Many nodes are common to several graphs, but they may lose connections with some nodes or gain connections with others as time progresses.

As such, I would like to be able to visualise this in a software package such as Gephi. The code I use to generate this dynamic network is:

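import dynetx as dn

# One graph per time point, in temporal order (graph1, graph2, etc.)
networklistOne = [graph1, graph2]
dynamic_graph = dn.DynGraph(edge_removal=True)
for t, graph in enumerate(networklistOne):
    # Register every edge of this snapshot as an interaction at time t
    dynamic_graph.add_interactions_from(graph.edges(data=True), t=t)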

My questions are:

Is there a way to save this in a format that Gephi can read? DyNetx is slightly different from networkx, so I am not sure about this.

The total number of nodes in the dynamic graph is not equal to the sum of the node counts of the individual networks. I am guessing this is because many nodes are common to several networks, and that the idea is to understand which nodes and edges are added or removed over time?

Is there a way of carrying out community detection on the dynamic graph?

Thank you in advance!

GiulioRossetti commented 4 years ago

Hi, unfortunately, at the moment there is no way to read or write dynamic graphml files (the ones that, if I remember correctly, are handled by Gephi). I'll add it to the feature request list.

You are right, the total number of nodes in a dynamic graph is not simply the sum of the nodes in each snapshot: nodes and edges may join or leave the network as time goes by, so it is likely that some of them (or most of them, depending on the observed phenomenon) appear multiple times in the network history.
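A minimal sketch of that point, using plain Python sets rather than dynetx itself. The toy snapshots and the helper `nodes_of` are hypothetical and exist only for illustration: nodes shared between snapshots are counted once in the distinct total, so that total is smaller than the sum of the per-snapshot counts.

```python
# Hypothetical toy snapshots: each is a list of (u, v) edges at one time point.
snapshots = [
    [("a", "b"), ("b", "c")],  # t = 0
    [("a", "b"), ("c", "d")],  # t = 1
    [("b", "c"), ("d", "e")],  # t = 2
]


def nodes_of(edges):
    """Return the set of nodes touched by a list of edges."""
    return {node for edge in edges for node in edge}


# Node count of each snapshot taken on its own
per_snapshot_counts = [len(nodes_of(edges)) for edges in snapshots]

# Naive total: nodes shared between snapshots are counted repeatedly
sum_of_counts = sum(per_snapshot_counts)

# Distinct nodes over the whole history: shared nodes are counted once
distinct_nodes = set().union(*(nodes_of(edges) for edges in snapshots))

print(per_snapshot_counts, sum_of_counts, len(distinct_nodes))
```

Comparing `sum_of_counts` with `len(distinct_nodes)` on real snapshots gives a quick measure of how much the time points overlap.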

Regarding community discovery on dynamic networks: there are several approaches to this problem (here is a comprehensive survey). There are a few implementations of dynamic community discovery (DCD) approaches out there (e.g., Tiles, or those in the tnetwork library). However, we are not planning to add them to dynetx, since we are also involved in a broader project (CDlib) whose aim is to make implementations of community discovery algorithms available in a structured way. We have already planned to extend that library to support dynamic community discovery in a forthcoming release.

Best, Giulio

amjass12 commented 4 years ago

Hi @GiulioRossetti ,

Thank you so much for your detailed response and for adding this as a feature request. This package is incredibly powerful in telling me how the graph is changing over time! It would be great to be able to visualise these dynamics.

Thank you for confirming the point about node numbers, and for the community detection information. I will wait for the CDlib implementation; is there a rough ETA for this release? In the meantime, I have found that running best-partition detection on the individual graphs at each time point is sufficient.

Thanks!

GiulioRossetti commented 4 years ago

Hi, in CDlib we already have a more or less stable backbone for dynamic community representation: the release with DCD methods mostly depends on how complex it will be to integrate existing methods.

Unfortunately, very few of them are coded in Python, and all of them model dynamic graphs in an ad hoc manner. Hopefully, by the end of the year we'll have enough methods to justify merging DCD into the master branch.

amjass12 commented 4 years ago

Great, thank you! Sorry, I have one more question. It is about something I have noticed in my temporal data, and I am not sure whether it is a bug (sorry, just to clarify).

I notice that components that are absent from the graphs at some time points are not actually shown as being lost. For example, I have a node 'x' that is connected to a group of about 80 other nodes at time point 1. This node no longer has a connection with any other node at time point 2. (I have verified that at time point 2 this node is indeed absent, as it does not form any connections under the criteria I have set out, e.g. Pearson correlation.)

When running

for e in g.stream_interactions():
    print(e)

and saving the output, I see for example that node x is present at time point 1 with a '+', but at time point 2, where it is absent, nothing shows that it has disappeared from the nodes it was connected to at time point 1; in other words, there is no '-'.

Additionally, I have noticed that in every row with a '-', the two nodes are the same:


ENSMUSG00000026691 | ENSMUSG00000026691 | - | 3
ENSMUSG00000062510 | ENSMUSG00000062510 | - | 3
ENSMUSG00000058740 | ENSMUSG00000058740 | - | 3

Have I misunderstood something? I would expect a node that is lost at time 'x' to be recorded as lost from the different nodes it was connected to at the previous time point.

So a row might read, for example, ENSMUSG--1 | ENSMUSG...2 | '-' (if lost at a transition).
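The expectation described above can be checked independently of dynetx by comparing the edge sets of consecutive snapshots. The sketch below is only that: it is not dynetx's implementation and makes no claim about how stream_interactions() assigns '-' records. The helpers `normalise` and `edge_changes` and the toy data are hypothetical, and the sketch assumes undirected edges, with a '-' recorded at the first time point where an edge is missing.

```python
def normalise(edge):
    """Treat edges as undirected: (u, v) and (v, u) are the same edge."""
    u, v = edge
    return (u, v) if u <= v else (v, u)


def edge_changes(snapshots):
    """Return (u, v, sign, t) records for edges that appear ('+') or vanish ('-') at t."""
    records = []
    previous = set()
    for t, edges in enumerate(snapshots):
        current = {normalise(e) for e in edges}
        # Edges present now but not in the previous snapshot
        for u, v in sorted(current - previous):
            records.append((u, v, "+", t))
        # Edges present in the previous snapshot but missing now
        for u, v in sorted(previous - current):
            records.append((u, v, "-", t))
        previous = current
    return records


# Hypothetical toy data: node "x" loses all of its edges at t = 1
snapshots = [
    [("x", "n1"), ("x", "n2"), ("n1", "n2")],  # t = 0
    [("n1", "n2")],                            # t = 1
]
changes = edge_changes(snapshots)
for record in changes:
    print(record)
```

With real data, each snapshot would be the list of (u, v) pairs from one graph's edges, and the resulting records can be compared row by row with the saved interactions.csv.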

Please see the complete code; maybe I have made a mistake?

import dynetx as dn
import pandas as pd

networklistTemporal = [networkGraph14, networkGraph12, networkGraph15, networkGraph17, networkGraph13, networkGraph16, networkGraph18]

# Add each snapshot's edges as interactions at time t (the list index)
dynamic_graph = dn.DynGraph()
for t, graph in enumerate(networklistTemporal):
    dynamic_graph.add_interactions_from(graph.edges(data=True), t=t)

# Collect every streamed interaction and save the result as CSV
interactions = list(dynamic_graph.stream_interactions())
df = pd.DataFrame(interactions)
df.to_csv('interactions.csv')

Thank you for your time, and sorry for the additional question.

Edit: I have noticed that the number of time points in the output of dynamic_graph.stream_interactions() is 8 (0-7) as opposed to 7 (0-6).

When I run dynamic_graph.temporal_snapshots_ids(), it does show the correct number of time points:

dynamic_graph.temporal_snapshots_ids()
[0, 1, 2, 3, 4, 5, 6]

dayanandv commented 3 years ago

Hi @amjass12,

If you can compromise on not having Gephi's exact dynamic graph format (with Timeline), you can stream the graph to Gephi. Some time ago I modified and used pygephi, which relies on Gephi's Graph Streaming plugin. The plugin runs a server inside Gephi (default port number 8082) that receives and processes graph updates sent as JSON. I wrote a connector class around it to interface my application (which uses DyNetx) with Gephi. I am attaching the connector class file, the modified py3gephi, a toy example, and a GIF of how it appears in Gephi: Gephi_Connector_Files.zip

I can't guarantee the correctness or efficiency of the approach; consider it a workaround. Hope this helps.
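A rough sketch of the streaming idea above; it is not the attached connector class. It only builds the JSON update lines and does not contact Gephi. The event keys ("an" for add node, "ae" for add edge, "de" for delete edge) are an assumption based on the Graph Streaming plugin's documented event format and should be checked against the plugin version in use. The edge identifier scheme and all helper names are hypothetical. The input records follow the (u, v, sign, t) shape of the interaction rows shown earlier in this thread.

```python
import json


def edge_id(u, v):
    """Hypothetical edge identifier: sorted endpoints joined by '--'."""
    a, b = sorted((str(u), str(v)))
    return f"{a}--{b}"


def add_node_event(node):
    # Assumed plugin event key: "an" = add node
    return json.dumps({"an": {str(node): {"label": str(node)}}})


def add_edge_event(u, v):
    # Assumed plugin event key: "ae" = add edge
    attrs = {"source": str(u), "target": str(v), "directed": False}
    return json.dumps({"ae": {edge_id(u, v): attrs}})


def delete_edge_event(u, v):
    # Assumed plugin event key: "de" = delete edge
    return json.dumps({"de": {edge_id(u, v): {}}})


def events_for(records):
    """Translate (u, v, sign, t) records into a list of JSON event strings."""
    seen_nodes = set()
    events = []
    for u, v, sign, _t in records:
        if sign == "+":
            # A node must exist before an edge can reference it
            for node in (u, v):
                if node not in seen_nodes:
                    seen_nodes.add(node)
                    events.append(add_node_event(node))
            events.append(add_edge_event(u, v))
        else:
            events.append(delete_edge_event(u, v))
    return events


# Hypothetical records: one edge appears at t = 0 and is removed at t = 1
records = [("a", "b", "+", 0), ("a", "b", "-", 1)]
events = events_for(records)
for line in events:
    print(line)
```

Each resulting line would then be posted to the server that the plugin runs inside Gephi; the endpoint and port depend on the plugin configuration, so they are left out here.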