Closed jim22k closed 4 years ago
A variant of Approach 2 would be to require a node to have a numeric ID (like a label) which does not need to be consecutive. This would be more efficient to implement on accelerators, and could still be auto-assigned when a graph is created.
As an example, suppose we create a Graph
(using fake graphviz notation here) from this:
a->b
b->c
b->d
we would want to have a graph with the following node properties:
node_offset | node_id | node_label |
---|---|---|
0 | 0 | "a" |
1 | 1 | "b" |
2 | 2 | "c" |
3 | 3 | "d" |
Then if we run a filter operation and select all nodes but "c", we would get the new graph: | node_offset | node_id | node_label |
---|---|---|---|
0 | 0 | "a" | |
1 | 1 | "b" | |
2 | 3 | "d" |
Which would ensure we could connect node "d" between the two graphs without having to do a string comparison.
Before we dig deeply into the base types, let's define what a node is.
networkx
lets a node be any hashable object. Each node must be unique within a graph. Let's call this Hashable.graphblas
requires nodes to be sequential uints starting from 0. Let's call this Sequential.These seem to be the most common definitions of nodes used by various libraries. Sometimes there will be a concept of a node label.
igraph
uses Sequential nodes, but lets you add a "label" attribute to each node, which needs to be a string. Most methods and algorithms which expect a node let you pass in either the uint or the string and it figures things out.I see two approaches to handling the difference in node expectations as we convert between different library representations of Graph and Nodes.
Approach 1
Require all nodes to be Sequential. When this is not a natural constraint in a library (like networkx), we create a copy and assign each node a sequential number in init. We still keep the mapping around for reference, but that mapping is not passed around.
Approach 2
Give every node a label. For graphs with Sequential nodes, the uint number is the label. Maintain a mapping between node label and node position and pass this mapping around. This allows each backend library to use either the label or position as appropriate for their implementation. Note: this is our current approach with
node_index