paulbrodersen / netgraph

Publication-quality network visualisations in python
GNU General Public License v3.0

ArcDiagram node_order layout param not being applied #47

Closed arpieb closed 10 months ago

arpieb commented 2 years ago

Hi there! Very nice library, trying to leverage the ArcDiagram plot and running into an issue. I'm creating a graph where the nodes represent percentages (string labels '0' to '100') and the edges are all from '0' to whatever the percentage is for the datapoint (e.g. '55'). I would like to render the arc diagram with the nodes arranged 0-100 in a linear fashion, and am passing in the ordered list of nodes to the ArcDiagram constructor's node_order param as ['0', '1', '2',...'99', '100'] but the graph rendering appears to be ignoring this.

Am I using node_order incorrectly or is it not being applied for some reason?

Thanks!

arpieb commented 2 years ago

Minimal example:

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
from netgraph import ArcDiagram

G = nx.Graph()

nodes = [str(x) for x in range(101)]
G.add_nodes_from(nodes)

rng = np.random.default_rng()
for pct in rng.integers(100, size=150):
    G.add_edge('0', str(pct))

ArcDiagram(G, node_order=nodes, node_size=1)
plt.show()


paulbrodersen commented 2 years ago

Hi, thanks for raising the issue.

The node_order parameter is not being ignored. The issue is that netgraph computes the layout for each connected component separately. As a result, the nodes within each component are ordered correctly; however, your unconnected nodes are placed in "incorrect" places.

Figure_1

#!/usr/bin/env python
"""
Issue no. 47: node_order layout param not being applied
"""

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
from netgraph import ArcDiagram

G = nx.Graph()

nodes = [str(x) for x in range(11)]
G.add_nodes_from(nodes)

rng = np.random.default_rng()
for pct in rng.integers(10, size=15):
    G.add_edge('0', str(pct))

ArcDiagram(G, node_labels=True, node_order=nodes)
plt.show()
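
To check which nodes end up outside the main component, it can help to list the connected components directly. The following is a minimal sketch using networkx only; the edge list is made up for illustration and stands in for the random edges above.

```python
import networkx as nx

# Small deterministic stand-in for the example graph: nodes '0'..'10',
# with edges from '0' to only a few of them (this edge list is made up).
nodes = [str(x) for x in range(11)]
G = nx.Graph()
G.add_nodes_from(nodes)
G.add_edges_from([('0', '2'), ('0', '5'), ('0', '9')])

# connected_components() makes no promise about ordering,
# so sort explicitly from largest to smallest.
components = sorted(nx.connected_components(G), key=len, reverse=True)
largest = components[0]

# Nodes outside the largest component, listed in the requested node order.
stragglers = [node for node in nodes if node not in largest]

print(len(components))
print(sorted(largest, key=int))
print(stragglers)
```

Every node listed in `stragglers` forms its own single-node component, so its position is decided separately from the nodes connected to '0'.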

Computing the node layout for each component separately makes sense in most cases, as node layout algorithms are typically only defined for connected components. Your case, i.e. an arc diagram with a pre-determined node order, is one of the two exceptions in the library (the other being a circular layout with a pre-determined node order). So I probably won't fix this behaviour, as it would increase the complexity of my code substantially for fairly little gain.

However, there is an easy work-around to your problem: invisible edges from the largest connected component to your other components.

Figure_2

#!/usr/bin/env python
"""
Issue no. 47: node_order layout param not being applied
"""

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
from netgraph import ArcDiagram

G = nx.Graph()

nodes = [str(x) for x in range(11)]
G.add_nodes_from(nodes)

rng = np.random.default_rng()
for pct in rng.integers(10, size=15):
    G.add_edge('0', str(pct))

edge_color = dict()
for edge in G.edges:
    edge_color[edge] = 'black'

# add invisible edges to all nodes that are not in the largest connected component
largest_component = max(nx.connected_components(G), key=len)  # connected_components() does not guarantee ordering by size
largest_component_node = list(largest_component)[0]
for node in nodes:
    if node not in largest_component:
        G.add_edge(largest_component_node, node)
        edge_color[(largest_component_node, node)] = 'white'

# give white edges a low zorder so they don't leave white stripes when crossing black edges
edge_zorder = {edge: 1 if color == 'black' else -1 for edge, color in edge_color.items()}

ArcDiagram(G, node_labels=True, node_order=nodes, edge_color=edge_color, edge_zorder=edge_zorder)
plt.show()
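
If the smaller components contain more than one node each, one extra edge per component is already enough to make the graph connected. The sketch below shows that variant with networkx only. `bridge_components` is a hypothetical helper name, not part of netgraph or networkx, and it assumes a non-empty undirected graph.

```python
import networkx as nx

def bridge_components(G):
    """Connect every smaller component to the largest one.

    Adds one edge per smaller component and returns the list of added
    edges so that the caller can style (e.g. hide) them afterwards.
    Hypothetical helper; assumes a non-empty undirected graph.
    """
    components = sorted(nx.connected_components(G), key=len, reverse=True)
    anchor = min(components[0])  # any node of the largest component works
    added = []
    for component in components[1:]:
        target = min(component)  # one representative node is enough
        G.add_edge(anchor, target)
        added.append((anchor, target))
    return added

# Made-up example: one three-node component, one pair, one isolated node.
G = nx.Graph()
G.add_nodes_from(str(x) for x in range(6))
G.add_edges_from([('0', '2'), ('0', '4'), ('3', '5')])

added = bridge_components(G)
print(nx.is_connected(G))
print(sorted(added))
```

The returned edges could then be coloured white and given a low zorder, as in the workaround above.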

arpieb commented 2 years ago

Got it, thank you very much for the detailed explanation and example!

paulbrodersen commented 1 year ago

I am re-opening the issue because it is probably worth fixing the underlying issue despite the increase in code complexity and the existence of a workaround.

paulbrodersen commented 10 months ago

This issue should now be fixed on the dev branch, and the fix will be available on the main branch, PyPI, and conda-forge with the next major release (>5.0.0).