Closed jim22k closed 4 years ago
Things brought up in the 2020-05-08 meeting:
Without going into full detail, here is a summary of what Jim and I discussed.
I like having the primary objects:
- NodeSet, NodeMap, NodeTable
- EdgeSet, EdgeMap, EdgeTable
A single node is generally given by its integer id, and a single edge is generally given by a tuple of nodes.
For calls that return a single node or edge, we can create lightweight Node and Edge objects.
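A minimal sketch of what such lightweight objects could look like, assuming a NamedTuple-based design. The field names `id`, `source`, and `target` are hypothetical and not part of any agreed API:

```python
from typing import NamedTuple


class Node(NamedTuple):
    """Hypothetical lightweight wrapper around a single integer node id."""
    id: int


class Edge(NamedTuple):
    """Hypothetical lightweight wrapper around a pair of node ids."""
    source: int
    target: int


node = Node(3)
edge = Edge(3, 7)

# A NamedTuple still behaves like a plain tuple, so an Edge unpacks
# into the "tuple of nodes" form and compares equal to it.
src, dst = edge
```

Building on tuples would keep these objects cheap and interchangeable with the plain `(source, target)` form, which is one possible reason to prefer this over a full class.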
I think we should have a NodeIndex that can map between node ids and any hashable Python object (in both directions). We should handle these conversions in metagraph; don't require backends to do this (or maybe make it trivial for them to do).
All graph-like things can have an optional NodeIndex. This lets single nodes and edges be referred to by any hashable Python object other than an int.
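One way such a NodeIndex could work is sketched below. Everything here is an assumption made for illustration: the method names `to_id` and `to_label`, the choice of node ids as positions `0..n-1`, and the decision to reject int labels outright.

```python
from typing import Dict, Hashable, Iterable, List


class NodeIndex:
    """Hypothetical two-way mapping between labels and integer node ids."""

    def __init__(self, labels: Iterable[Hashable]):
        self._labels: List[Hashable] = []      # node id -> label
        self._ids: Dict[Hashable, int] = {}    # label -> node id
        for label in labels:
            # Assumption: ints are rejected so a label can never be
            # confused with a raw node id.
            if isinstance(label, int):
                raise TypeError("labels must not be ints")
            if label in self._ids:
                raise ValueError(f"duplicate label: {label!r}")
            self._ids[label] = len(self._labels)
            self._labels.append(label)

    def __len__(self) -> int:
        return len(self._labels)

    def to_id(self, label: Hashable) -> int:
        """Translate a label into its integer node id."""
        return self._ids[label]

    def to_label(self, node_id: int) -> Hashable:
        """Translate an integer node id back into its label."""
        return self._labels[node_id]


index = NodeIndex(["alice", "bob", ("x", 1)])

# An edge given by labels is translated to an edge given by node ids.
edge_by_label = ("alice", "bob")
edge_by_id = tuple(index.to_id(label) for label in edge_by_label)
```

With the translation kept in one small object like this, a backend would only ever see integer ids, which matches the goal of not requiring backends to handle the conversion.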
I think datatypes should have properties such as positive and nonzero.
A value from a {Node,Edge}Map is simply, e.g., a Python scalar.
A value from a {Node,Edge}Table is a Python dict.
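To make the two value shapes concrete, here is a rough sketch that uses plain dicts as stand-ins for the real containers. The property names (`weight`, `color`, `kind`) and the sample data are invented for illustration:

```python
# Stand-ins only: plain dicts keyed by node id or by a tuple of node ids.
node_map = {0: 1.5, 1: 2.0}
node_table = {
    0: {"weight": 1.5, "color": "red"},
    1: {"weight": 2.0, "color": "blue"},
}
edge_map = {(0, 1): 0.25}
edge_table = {(0, 1): {"weight": 0.25, "kind": "road"}}

# Looking up one node in a Map-like object yields a scalar ...
map_value = node_map[0]
# ... while the same lookup in a Table-like object yields a dict of properties.
table_value = node_table[0]
```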
This gives us the possibility of composing a richer Graph object that consists of a Node{Set,Map,Table}, an Edge{Set,Map,Table}, and an optional NodeIndex. We can then have edges_type and nodes_type as properties if we need to distinguish between these. This could be especially handy when converting to/from external graph objects.
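The composition could be sketched roughly as follows. The `NodeSet` and `EdgeMap` classes here are empty placeholders built on `set` and `dict`, and the attribute names `nodes`, `edges`, and `node_index` are assumptions. Only `nodes_type` and `edges_type` come from the discussion above, and returning the class name as a string is just one possible choice.

```python
from dataclasses import dataclass
from typing import Any, Optional


# Placeholder containers; the real objects would carry their own methods.
class NodeSet(set):
    pass


class EdgeMap(dict):
    pass


@dataclass
class Graph:
    """Hypothetical composite of a node container and an edge container."""
    nodes: Any                        # a Node{Set,Map,Table}
    edges: Any                        # an Edge{Set,Map,Table}
    node_index: Optional[Any] = None  # optional label <-> id mapping

    @property
    def nodes_type(self) -> str:
        # Report which kind of node container this graph was built from.
        return type(self.nodes).__name__

    @property
    def edges_type(self) -> str:
        # Report which kind of edge container this graph was built from.
        return type(self.edges).__name__


graph = Graph(
    nodes=NodeSet({0, 1, 2}),
    edges=EdgeMap({(0, 1): 0.5, (1, 2): 2.0}),
)
```

A converter to an external graph library could then branch on `graph.edges_type` to decide, for example, whether edge weights need to be carried over.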
Similarly, in the future we could introduce rich BipartiteGraph, DynamicGraph (and DynamicEdge*), HyperGraph, MultiGraph, etc. objects.
Question: can node or node_id be a dtype?
Finally, DataFrame, Vector, and Matrix are more-or-less unchanged and can evolve as necessary.
- not really a type, more of a concept
- can be used to label nodes using strings
- no missing values allowed
- O(1) lookup by position
- iteration order guaranteed
- missing values allowed
- O(1) lookup by position
- O(1) test for emptiness by position
- no missing values allowed
- O(1) lookup by position
- missing values allowed
- O(1) lookup by position
- O(1) test for emptiness by position
- each column has a unique string label
- each column has a single dtype
- O(1) indication of node inclusion
num_nodes, __contains__
- O(1) lookup by node
- single dtype
- no missing values
num_nodes, __getitem__, __contains__
- each property has a unique string label
- values are allowed to be empty
num_nodes, num_properties
- O(1) lookup for node inclusion
- O(1) lookup for edge value
- O(1) lookup for edge presence
num_nodes, num_edges
- single dtype for edge values
- edges are undirected
num_A_nodes, num_B_nodes, num_edges
- each graph has a string label associated with the property
num_nodes, num_properties
Additional things to consider
Which types require a Wrapper? Probably all the Node and Graph variants.
Should we formalize the hierarchy of abstract properties?