Closed liamhuber closed 9 months ago
Check out this pull request on
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
Sorry I didn't manage to do it today (same applies to other PR that I couldn't review). I'll try to do it tomorrow
I will finish this tomorrow morning, should be ready before your work day starts!
I will finish this tomorrow morning, should be ready before your work day starts!
@srmnitc no worries. Actually, reviewing the code itself is the least of my reason's for @'ing you.
What I'm moving towards here is a principle that for graph based workflows the semantic path (i.e. chain of node names), filesystem path (i.e. working directories), and storage path (in __getstate__
, inside HDF5, etc.). I was curious if this already aligns with your thinking or if you had feedback.
On the one hand it feels like a trivial idea, but actually at least in the case of __getstate__
it really requires intentionality or you get extra container terms wedged in between nodes.
Abstractly, we can think about the "semantic path" of a node as the chain of node labels from the graph root down to that node, e.g.
"some_workflow/some_macro/some_sub_macro/a_macro_child_node"
.Composite
nodes hold their children in a (dot)dictionary.nodes
, but use__getattr__
so this semantic path can be used with live objects, e.g.some_workflow.some_macro.some_sub_macro.a_macro_child_node
.Naively storing a
Composite
's state, the path in the state winds up looking instead like"some_workflow/nodes/some_macro/nodes/some_sub_macro/nodes/a_macro_child_node"
.Here, we modify
__getstate__
and__setstate__
to de(re)constuct thenodes
dictionary so the state path matches the semantic path.This is safe since children of
Composite
are already forbidden from having a label that conflicts with any of its parent's attributes, so the space in the state dictionary can't possibly conflict with anything.Since we use
__get/setstate__
for file storage with HDF5, this means the storage path will also match the semantic path (or will match it after downstream PRs in the case of the"tinybase"
storage mode, which isn't yet using the state).