cytoscape / ipycytoscape

A Cytoscape Jupyter widget
https://ipycytoscape.readthedocs.io/en/master/
BSD 3-Clause "New" or "Revised" License
269 stars 62 forks source link

Bug: Multiple edges between same nodes not shown #125

Closed Shivam-Miglani closed 4 years ago

Shivam-Miglani commented 4 years ago

For a graph with edges A,B, weight=15 B,A, weight=10

Only the first edge object is retained in the add_edges(self, edges, directed=False) method, which is wrong behaviour.

This add_edges method is used in add_graph_from_networkx method, which leads to deletion of certain edges from networkx graph.

image

Shivam-Miglani commented 4 years ago

Setting directed = True exhibits the same (wrong) behavior.

ianhi commented 4 years ago

Good catch!

This looks to be due to this line: https://github.com/QuantStack/ipycytoscape/blob/72b73c6a8bd4d352bfcf97c71a24036b1524573e/ipycytoscape/cytoscape.py#L273-L274

This was added to prevent accidental duplication of edges but it has the unfortunate consequence of preventing intentionally having multiple edges between nodes. I'm not sure what the best way to deal with this is, @MridulS what do you think?

Copy pasteable example:

import networkx as nx
import ipycytoscape as ipc
G = nx.DiGraph()
G.add_edge(1,2,weight=15)
G.add_edge(2,1,weight=10)
cyto = ipc.CytoscapeWidget()
cyto.graph.add_graph_from_networkx(G)
print(G.edges(data=True))
print(cyto.graph.edges)
display(cyto)
ianhi commented 4 years ago

I guess the issue with duplication is really that we shouldn't duplicate nodes if the nodes already exist, but we should allow multiple edges between two nodes even if the edges have all the same properties.

MridulS commented 4 years ago

Ahhhhh, my bad. I thought I fixed it. Thanks for the catch @Shivam-Miglani!

https://github.com/QuantStack/ipycytoscape/pull/126 should fix it.

and yes you would need to pass it directed=True currently while creating the Cytoscape graph object as we don't infer the type automatically. (maybe we should?)

The following code should work.

import networkx as nx
import ipycytoscape as ipc
G = nx.DiGraph()
G.add_edge(1,2,weight=15)
G.add_edge(2,1,weight=10)
cyto = ipc.CytoscapeWidget()
cyto.graph.add_graph_from_networkx(G, directed=True)
print(G.edges(data=True))
print(cyto.graph.edges)
display(cyto)
ianhi commented 4 years ago

@MridulS I'm not sure that totally solves the problem in it's entirety because it does not allow multiple non-directed edges between nodes.

I think there may also be a philosophical question here about what the result of:

cytoscapeobj.graph.add_graph_from_networkx(nx.complete_graph(5))
cytoscapeobj.graph.add_graph_from_networkx(nx.complete_graph(3))

should be.

It could be 1: New edges but not new nodes image 2: no duplication image

3: replace existing graph image

my sense is that the result should be the first option but curious what others think.

MridulS commented 4 years ago

IMO it should be 3, and that's why I raised the idea of having a clear graph method whenever add_graph_from_networkx is called. This is an API design decision if add_graph_from_networkx is supposed to be used successively or not, I feel we should relegate the graph manipulation to functions like add_node, add_edge, we shouldn't use add_graph_from_networkx to manipulate the graph object. But again this is just my opinion so ¯\(ツ)

Figure 1 becomes a MultiGraph it's not a Graph anymore, Figure 2 is what is happening right now.

And currently the way code is structured, ipycytoscape doesn't support MultiGraphs, only Graph and Directed Graphs, we should definitely add support for MultiGraphs too, and be more explicit about it.

NOTE: I use the terms MultiGraph and Graph the way NetworkX defines it.

ianhi commented 4 years ago

Figure 1 becomes a MultiGraph it's not a Graph anymore, Figure 2 is what is happening right now.

That's a very good point - and figure 2 is what neither of us would expect :joy: so that's not optimal.

I feel we should relegate the graph manipulation to functions like add_node, add_edge

I'm wary of that because there is currently no easy way to pass those functions a networkx object, you have to first create an ipycytoscape.Node.

This is an API design decision if add_graph_from_networkx is supposed to be used successively or not

To me the add prefix implies that I should be able to use the function successively. Being able to call add_node successively but not add_from_networkx successively seems inconsistent. Maybe what we need is a function to do both? As in: set_.... and add_...

And currently the way code is structured, ipycytoscape doesn't support MultiGraphs, only Graph and Directed Graphs, we should definitely add support for MultiGraphs too, and be more explicit about it.

Strong agree :+1:. Figuring out how to do this would probably also guide how to resolve how the various add functions should work. Initial proposal: The cytoscape.graph object be a multigraph that allows mixed edges and then have add_edges and add_nodes take arguments that signal whether the graph should be added as a multigraph or just a graph, or override etc...

ianhi commented 4 years ago

The undirected graph in the examples may be a good test case here. I think being able to add a second disconnected graph by way of networkx without having to separately create ipycytoscape.Nodes is important to be able to do. Though the api of using add_from_networkx doesn't necessarily need to be the way to accomplish that. image

MridulS commented 4 years ago

I think being able to add a second disconnected graph by way of networkx without having to separately create ipycytoscape.Nodes is important to be able to do.

We don't really need to do that. We just need to make the second component have different underlying nodes, but by default all the graph generators in networkx will result in nodes like [0, 1, 2, 3, 4, ....]. All of our examples use networkx generators but people would be using their own data and if the networkx object they have created have the different components it will work with ipycytoscape. We shouldn't change the way nodes are presented while moving from networkx to ipycytoscape.

The following code is perfectly valid, the user just has to make sure the ipycystoscape graph is a valid networkx graph.

G = nx.complete_graph(30)
H = nx.subgraph(G, nbunch=[1, 2, 3, 4, 5])  # extract a 5 node complete graph
I = nx.subgraph(G, nbunch=[8, 9,10])  # extract a 3 node complete graph with nodes 8, 9 and 10 
J = nx.subgraph(G, nbunch=[20,21]) # extract a 2 node complete graph with nodes 20 and 21
undirected = ipycytoscape.CytoscapeWidget() 
undirected.graph.add_graph_from_networkx(H)
undirected.graph.add_graph_from_networkx(I)
undirected.graph.add_graph_from_networkx(J)
ianhi commented 4 years ago

@MridulS I don't think I fully understand sorry. Could you post what you would expect the the outcome of your latest example to be?

MridulS commented 4 years ago

This is what I expect and what I get using the current master branch of ipycytoscape , successive add_graph_from_networkx() work here without any issues as there are no repeated nodes. The checks in add_nodes/add_edges will only come into picture when the user tries to add a duplicate node/edge in the graph.

Screenshot 2020-07-20 at 23 35 56
ianhi commented 4 years ago

Got it thanks! This is also what I would expect and what I was trying to convey with:

"The undirected graph in the examples may be a good test case here. I think being able to add a second disconnected graph by way of networkx without having to separately create ipycytoscape.Nodes is important to be able to do"

I agree that duplicate nodes should not be added by successive calls. I was just also considering if adding the same graph over and over again should allow for more edges to be added. Though on further reflection with y'all's examples I agree that this seems less principled.

As for the types of graphs supported by ipycytoscape what do you think of the idea that ipycytoscape's internal model would be a MultiMixedGraph? This would seem to allow the most flexibility in representation.

And for this example what result would you expect:

G = nx.DiGraph()
G.add_node(1)
G.add_node(2)
G.add_edge(1,2,weight=15)
G.add_edge(2,1,weight=10)

H = nx.MultiGraph()
H.add_node('a')
H.add_node('b')
H.add_edge('a', 'b')
H.add_edge('a', 'b')
H.add_edge('a', 'b')

mixed = ipycytoscape.CytoscapeWidget() 
mixed.graph.add_graph_from_networkx(G, directed=True)
mixed.graph.add_graph_from_networkx(H)
mixed

I would expect: image

ianhi commented 4 years ago

Another thing to consider is how this will work with the from from_json and from_df methods. Maybe instead of checking if directed was passed to add_edges it should have an argument like allow_duplicates (allow_parallel_edges?) that defaults to false but could be set as appropriate by the various from_* methods.

marimeireles commented 4 years ago

My 10cents in this discussion is that I agree with both of you and we should think on expanding to have multigraphs and so on. I didn't think of these ideas when we were talking before about adding this clean graph method @MridulS but yeah, it makes total sense to have this division and the things you said here https://github.com/QuantStack/ipycytoscape/issues/125#issuecomment-660520508 I think it's a good idea to follow networkx in this sense.

ianhi commented 4 years ago

Something that would be very helpful would be to define many possible situations, agree on what we think should be expected from them, and then figure out how to make that happen rather than make piecemeal changes. There are some examples in this thread already but I think there are a few more to include (e..g json) and we should put them all into one comment in a table or something

MridulS commented 4 years ago

Maybe we should add directed and multiple_edges as function arguments in add_* function signatures. This way we can support MultiGraphs with mixed edges, (directed and undirected). If we infer types directly from the networkx object we will lose out on mixed edges as networkx doesn't support that, IMO ipycytoscape shouldn't be limited to that. (I like seeing directed and undirected edges in my graphs :) )

Let me know if the following proposed API makes sense

mixed = ipycytoscape.CytoscapeWidget() 
mixed.graph.add_graph_from_networkx(G, directed=True)
mixed.graph.add_graph_from_networkx(H, multiple_edges=True, directed=True)
mixed.graph.add_graph_from_networkx(I) # defaults to False for both arguments.

this way we can easily translate the API to json and dataframe input format too while maintaining the same function signatures essentially.

mixed.graph.add_graph_from_json(json_file, directed=True)

@marimeireles @ianhi let me know what you folks think? :)

MridulS commented 4 years ago

Let me know if https://github.com/QuantStack/ipycytoscape/pull/126 is mergeable. Once that is done I can get the MulitGraph support in.

ianhi commented 4 years ago

@MridulS some experiments really took over my life for the last few weeks sorry about that!

If we infer types directly from the networkx object we will lose out on mixed edges as networkx doesn't support that, IMO ipycytoscape shouldn't be limited to that.

Agree about not limiting but can we use inference to get a benefit of making it easier sometimes without limiting what is possible? I.e. if I pass a networkx multigraph can it then default to multiple_edges=True but have all those edges be undirected, and a MultiDiGraph would automatically set both flags to true?

I think those are both sensible arguments to add. Would you like to adapt 126 to add them as well or should we add them separately?

MridulS commented 4 years ago

Initial work on support for multiple edges https://github.com/QuantStack/ipycytoscape/pull/146

ianhi commented 4 years ago

Thanks to @MridulS for fixing this! Fixes for this will be included in any release >1.0.3