Incremental interpretation of Relay graph

ramwan commented 5 years ago

Eventually we'll need to be able to monitor what's happening on the network in the real world and will need to modify our model. We also need to be able to make incremental changes to an abstract model which then gets interpreted to make changes to the underlying network.

CMCDragonkai commented 5 years ago

To reiterate, the graph inside the Relay code is an "abstract" graph. Sort of like document object model that web browsers have.

Similar to the DOM of browers, as you modify the DOM, the browser will automatically re-render the page accordingly. It's a living data structure.

So our relay graph has to be similar capabilities. As we modify the relay abstract graph, this needs to be interpreted as changes to the concrete network graph between automatons (network namespaces, veth pairs... etc).

In a simple sense this interpreter is just like the recursion scheme interpreters using the Fix pattern. And while that works nicely for being able to decompose the interpretation into different functions and case matches, the graph libraries of Haskell offer a similar thing via the fold functions. The cata from Data.Fix is just a generalisation of fold.

However we need to go a step further. We will need incremental interpretation such that we are not attempting to interpret the entire graph each time a minor change was made. If a minor change is made, only the equivalent minor change to the concrete networking graph needs to be made.

I'll post resources to self adjusting computations and incremental computation here to investigate.

CMCDragonkai commented 5 years ago

Another good example of this is the browser domain object model. Consider that when you open up your inspect panel on Chrome. You can right click some element on the page and then delete that element. The DOM is "live" graph, as it is capable of being changed driven by the programming logic, but also be the user-action. Also a live graph, changes to nodes in the graph propagates changes to other nodes in the graph in a cascading fashion. However these changes are "minimal" in the sense that they don't affect the entire graph. Imagine wanting to change 1 node in the graph and having to replace the entire graph with the new graph with that node. This is obviously not what's happening with the browser DOM.

We want the Matrix graph to be a series of augmentations and events on the original Architect language expression. Unlike compilers we don't take high level code and remove information until we are left with just a pure computational expression that is both minimal and focused purely on performance. Instead of doing this, we instead "augment" our original minimal Architect expression with extra information. Each step can add different kinds of information. For example consider that at the high level, we never state how many instances an Automaton has, this is because should be derived automatically. So how is this augmented information derived? It should be derived from defaults and from the environment. You can think of the n number of instances to be a variable defined by an equation. This equation can default on something sensible (0 or 1 depending on how lazy we want to be). The final interpretation of this graph has to result in some base-case. We have to choose our base case which lives in the IO context. The base case here could be TCP/IP, but we can also have many different kinds of base cases. Events that occur in the IO context are then propagated backwards through the graph, and then they drive changes to the information that exists in the intermediate steps. They cascade backwards until we reach back the Architect expression (which most likely won't change at all or change very little because it is so high level and minimal). If these changes cause a change to n, this in turn drives changes to other parts of the graph such that Automaton instances may be upscaled or downscaled. The whole thing is recursive.

Right now we are using a graph library that isn't designed to be a "live" incrementally computed graph. However this is a start, and we will likely upgrade this. Further research required here. Augmentation has another important advantage, the augmented graphs shared their data with the original graph. So we are not creating completely separate graph data structures. Structure sharing in purely functional languages end up doing this for many of its data structures.

https://en.wikipedia.org/wiki/Zipper_(data_structure)
Scene Graph in Video Games
VirtualDOM
The Architect compiler based on the hnix compiler uses recursion schemes and functors to create augmented expression trees.
Purely functional data structures (persistent data structures) tries to do this a lot with pointers but there are other methods as well.

So our Matrix Graph is the DOM for infrastructure.

CMCDragonkai commented 5 years ago

@n-zhang-hp Since you're working on Emergence container runtime, I suggest you have a look at this as well, as the configuration graph for our containers and artifacts need to deal with a similar problem.

CMCDragonkai commented 5 years ago

Subsequently in the future we have to consider how to upgrade this system to be distributed over many orchestrator nodes and also persistent against failure, be highly available and highly consistent.

CMCDragonkai commented 5 years ago

As for implementing the base case, you've done this mostly with namespaces and veth pairs (and calls to libmnl/netlink sockets). I suggest going through VPN implementations to see how they've done it as they often have lots of work done for performance and security. @ramwan

CMCDragonkai commented 5 years ago

Lens and Zippers:

CMCDragonkai commented 5 years ago

http://www.frankmcsherry.org/differential-dataflow/ - incremental computation for dataflow networks

MatrixAI / Relay

Incremental interpretation of Relay graph #21