MatrixAI / Relay

Service-Centric Networking for Matrix Automatons

Relay/Orchestrator Roadmap: Content addressing of containers and environments #2

Open ghost opened 6 years ago

ghost commented 6 years ago

The key concerns of Relay and Orchestrator lie in providing a network and deployment service at the level of automatons (where an automaton can be thought of as a container plus its associated container environment, if any).

This involves implementing some of the mechanisms from the literature on P2P networks and service-based networks, such as service and peer discovery, establishing separate control and data planes between nodes in the network, and ensuring that network communication between two automatons is transparent to each service running inside the automatons.

We also introduce a content addressing system for containers and container environments, processes, network environments, and the hardware that our automatons may run on. The idea here is that we can use content-based addresses to precisely indicate to the orchestrator what container should be deployed on what machine, as well as the required environment for the container to run.
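
To make that concrete, here is a minimal sketch of how such an address could be derived; the spec shape, field names, and hash choice are assumptions, not a settled scheme:

const crypto = require('crypto');

// Hypothetical description of an automaton: a container plus the environment
// it needs. The field names are illustrative only.
const automatonSpec = {
  container: { image: 'nginx', version: '1.13.9' },
  environment: { os: 'nixos', packages: ['openssl'] },
};

// Sort keys recursively so that equivalent specs serialise identically.
function canonicalise(value) {
  if (Array.isArray(value)) return value.map(canonicalise);
  if (value && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) sorted[key] = canonicalise(value[key]);
    return sorted;
  }
  return value;
}

// The content address is just a hash of the canonical form: the same
// container + environment pair always maps to the same address.
function contentAddress(spec) {
  return crypto
    .createHash('sha256')
    .update(JSON.stringify(canonicalise(spec)))
    .digest('hex');
}

console.log(contentAddress(automatonSpec));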

Optimistically, existing p2p frameworks such as LibP2P will let us avoid reimplementing the necessary discovery mechanisms and transports for our automaton network.

So the (draft) roadmap looks like this:

  1. Define a minimal API that Relay exposes to the other Matrix subsystems. It should allow us to query for nodes and their associated content addresses, the containers that are running on the nodes, and the associated network configuration. Perhaps Relay will also expose a control surface that allows us to manipulate the configuration of each node. Furthermore, we should have some notion of installing a container on a node, or a network that allows us to deploy specific configurations (assuming that NixOS is the target OS). A sketch of such an API appears after this list.
    1.1. Start off with either integrating with LibP2P or writing a small peer discovery module over the regular transport.

  2. Investigate the need for a service abstraction layer to facilitate migration of flows between front-facing automatons as well as automatons that are internal to the network.

  3. Write a mini orchestrator that either is part of Relay, or interfaces with Relay, to deploy a small container application to multiple nodes based on the content address of each node; if necessary, it may spin up a NixOS image and bring it to the necessary environmental configuration to install the container.

  4. Determine how network communication inside the Matrix network occurs. When we want to implement automaton dependencies, we may want to specify exactly how two automatons communicate (that is, the selection of a transport between the two automatons).

  5. Produce a small example of all the main features we want to support:

    • [ ] Content Addressed Environments
    • [ ] An API that integrates with the container info collection on the Architect side to query a node for performance information
    • [ ] Programmatic deployment of containers and environment to specific nodes
    • [ ] Spinning up new nodes with NixOS
    • [ ] Automatically configure NixOS environment on demand for a container deploy
    • [ ] Data flow between peer nodes that is direct, not forwarded
  6. Integration with other subsystems so that we can automatically deploy a container and environment configuration through the API (so that the Architect interpreter can determine where to deploy such a container based on the QOS constraints).
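
As referenced in item 1, here is a very rough sketch of the kind of surface this minimal API might have; all names, shapes, and return values are placeholders, not a committed design:

// Hypothetical shape of the minimal Relay API exposed to other subsystems.
const relay = {
  // Query the known nodes and their content addresses.
  async listNodes() { return []; },
  // Query the containers running on a node, plus the associated network configuration.
  async listContainers(nodeAddress) { return []; },
  // Control surface: manipulate the configuration of a node.
  async configureNode(nodeAddress, config) { /* push config to the node */ },
  // Install a container (identified by content address) onto a node.
  async installContainer(nodeAddress, containerAddress) { /* deploy */ },
};

// Example usage by another subsystem (e.g. the Architect interpreter):
//   const nodes = await relay.listNodes();
//   await relay.installContainer(nodes[0].address, containerAddress);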

Most of the work here seems to be in implementing the discovery module and ensuring that we have a way to bring up the necessary environmental configuration for the container automatically using NixOS. Most of the features we'll need will probably come out of how this interacts with the other subsystems. I'll probably flesh out this roadmap a bit more as I work on these areas.

CMCDragonkai commented 6 years ago

Example of passing addresses:

// Automaton A

const http = require('http');

// this is not the way we should be doing this
// const matrix = require('matrix');

// addresses are injected by the orchestrator as environment variables
const env = process.env;

const server = http.createServer((req, res) => {

  if (req.url === '/blah') {

    // hardcoding a name, like this:
    //   const resultHardcoded = http.get('http://dogs.com');
    // is only ok if it can be done as easily and as efficiently as
    // reading the address from the environment, like this:
    const result = http.get(env.BADDRESS);

    // the same pattern applies to other transports (ftp and unixdomain are
    // illustrative placeholders here, not real modules):
    //   const result2 = ftp.get(env.CADDRESS);
    //   const result3 = unixdomain(env.DADDRESS);

    console.log(env.BADDRESS);

    // result is the outgoing request handle; piping the actual response
    // body back to the client is elided in this sketch
    res.write('fetched from ' + env.BADDRESS + '\n');
  }

  res.end();
});

server.on('clientError', (err, socket) => {
  socket.end('HTTP/1.1 400 Bad Request\r\n\r\n');
});

server.listen(env.APP_PORT);

This is the architect expression that somebody might write.

AutomatonA:
... ASD*(@#RJ#EOI$R)
... I expect BADDRESS that speaks RAML(HTTP): E*FDS(&(#*$U*$))
... I expect CADDRESS that speaks FTP
... I expect dogs.com to resolve tCAE(*UF(*SF&(S*DF)))
... I expose my service on port APP_PORT

AutomatonB:
... E*FDS(&(#*$U*$))

httpCombinator(
  AutomatonA,
  AutomatonB
)

// 'BADDRESS' = 'https://dsfosdifj9832134454-43'
// 'CADDRESS' = 'ftp://ofdgud98g9847edftgdg'
// DNS 8.8.8.8

So we have enumerated three ways of passing "content addresses" into the automaton. Using a custom matrix library is not a good way. Passing them as environment variables is a good way IMO.

The content address is the address of the interface.

The best way to think about this is like OOP.
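
A sketch of the environment-variable approach from the orchestrator's side, assuming it has already resolved the content addresses to concrete addresses; the script name and resolved values below are made up for illustration:

const { spawn } = require('child_process');

// Addresses resolved by the orchestrator from the content addresses declared
// in the Architect expression.
const resolved = {
  BADDRESS: 'https://dsfosdifj9832134454-43',
  CADDRESS: 'ftp://ofdgud98g9847edftgdg',
  APP_PORT: '8080',
};

// Launch AutomatonA's process with the resolved addresses in its environment,
// so its code only ever reads process.env.BADDRESS and friends.
const child = spawn('node', ['automaton-a.js'], {
  env: Object.assign({}, process.env, resolved),
  stdio: 'inherit',
});

child.on('exit', (code) => console.log('automaton exited with code ' + code));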

CMCDragonkai commented 6 years ago

You need to look at the source code of HTTP load balancers such as haproxy.

IPv6 Anycast.

Mesh networking.

ghost commented 6 years ago

So I can look at implementing the function that takes a content address and converts it to the necessary HTTP, IP, or FTP address in the source. The actual conversion depends on the other functionalities, i.e. the QOS (as it determines which machine the get actually comes from). If we think of the function as only producing temporary addresses, with the communication in our network happening over some arbitrary type of data flow, then how to implement temporary addresses becomes relevant (like IPv6 Anycast). So my area of research could be in doing this for multiple different types of protocols?
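
A minimal sketch of that conversion function, assuming the orchestrator keeps a table mapping content addresses to whatever concrete (possibly temporary) transport endpoint currently serves them; the endpoint values are invented:

// Hypothetical table maintained by the orchestrator/Relay; in practice it
// would be updated as automatons move and as QOS decisions change.
const endpoints = new Map([
  ['E*FDS(&(#*$U*$))', { protocol: 'http', host: '10.0.3.7', port: 8080 }],
]);

// Convert a content address into a concrete, temporary transport address.
function resolve(contentAddress) {
  const entry = endpoints.get(contentAddress);
  if (!entry) throw new Error('unknown content address: ' + contentAddress);
  return entry.protocol + '://' + entry.host + ':' + entry.port;
}

console.log(resolve('E*FDS(&(#*$U*$))')); // e.g. http://10.0.3.7:8080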

ghost commented 6 years ago

@CMCDragonkai From our discussion, if we use a centralized orchestrator, we don't really need to do discovery (a central node knows the location of every service). Mostly what remains is to redirect connections to the correct location. The low-level API to use is nftables (packet routing using BPF via bpfilter is not yet ready).
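
A rough sketch of what that redirection could look like, driven from Node; the table/chain layout and the idea of driving nft from a daemon are assumptions on my part, only the rule syntax itself is standard nftables:

const { execFileSync } = require('child_process');

// Redirect connections arriving on a service port to wherever the automaton
// currently lives, using a DNAT rule in a NAT prerouting chain.
function redirect(servicePort, currentHost, currentPort) {
  const commands = [
    ['add', 'table', 'ip', 'matrix'],
    ['add', 'chain', 'ip', 'matrix', 'prerouting',
      '{ type nat hook prerouting priority -100; policy accept; }'],
    ['add', 'rule', 'ip', 'matrix', 'prerouting',
      'tcp', 'dport', String(servicePort), 'dnat', 'to', currentHost + ':' + currentPort],
  ];
  for (const args of commands) {
    execFileSync('nft', args, { stdio: 'inherit' });
  }
}

// e.g. send connections for port 8080 to the node currently running the service:
//   redirect(8080, '10.0.3.7', 80);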

Since the minimal API stuff is sort of covered by the Haskell bindings that Vivian is writing, and the service abstraction layer is pretty minimal (really just mapping into nftables), I think I'll start work on part 3, the mini orchestrator. To start off with, I see this as a daemon that runs on the machine, similar to running an ipfs daemon.

The API should look something like this: suppose we have some abstraction for a machine on the network; then the mini orchestrator knows how to set up a .nix configuration on that machine. (If the node is running NixOS, we could represent an automaton environment using per-user profiles? Alternatively, one virtual machine per automaton.) I think just having an orchestrator that is able to deploy nix configurations to multiple machines is a step forward from the current state of things (there are orchestrators like Puppet that already do this for their own configuration files, but not for NixOS, unless I'm mistaken).
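
A first-cut sketch of that deploy step, assuming SSH access to a NixOS target; the helper name, host, and paths are made up:

const { execFileSync } = require('child_process');

// Push a generated nix configuration to a machine and activate it.
// Assumes we can ssh to the host as a user allowed to run nixos-rebuild.
function deployNixConfig(host, localConfigPath) {
  execFileSync('scp', [localConfigPath, host + ':/etc/nixos/configuration.nix'],
    { stdio: 'inherit' });
  execFileSync('ssh', [host, 'nixos-rebuild', 'switch'], { stdio: 'inherit' });
}

// deployNixConfig('root@node1.example', './automaton-env.nix');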

Thoughts?

CMCDragonkai commented 6 years ago

Each node maintains synchronisation and knowledge about automatons. So "centralised" here doesn't mean not-distributed.

CMCDragonkai commented 6 years ago

Deployment tools are irrelevant. The orchestrator nodes (which already form an orchestration tool) can receive Architect expressions and create the relevant nix expressions or alternative side-effectful commands.

ghost commented 6 years ago

Each node maintains synchronisation and knowledge about automatons. So "centralised" here doesn't mean not-distributed.

Yes, a daemon runs per node, but I imagine that daemon gets all its information from the orchestrator (and may additionally have information about automatons that are running on the same node).

Node orchestrators (are already an orchestrator tool) can receive architect expressions and create the relevant nix expressions

Yeah okay. I'll forget about part 3, the orchestrator, and focus exclusively on ensuring that whatever the substrate is (Docker container, unikernel), we can support migrating connections to the new location without the need for a load balancer. But do you still mean to tightly integrate with NixOS?

CMCDragonkai commented 6 years ago

The daemon is an orchestrator node. All daemons are part of the same orchestrator.

No need to worry about NixOS atm. It's only a potential substrate mediated through the nix language.

ramwan commented 6 years ago

So the (draft) roadmap looks like this:

  1. Define a minimal API that Relay exposes to the other Matrix subsystems. It should allow us to query for nodes and their associated content addresses, and containers that are running on the nodes, as well as associated network configuration. Perhaps relay will also expose a control surface that allows us to manipulate the configuration of each node. Furthermore, we should have some notion of installing a container on a node, or a network that allows us to deploy specific configurations (assuming that NixOS is the target OS). 1.1 Start off with integrating with either LibP2P or doing a small peer discovery module over the regular transport.
  2. Investigate the need for a service abstraction layer to facilitate migration of flows between front facing automatons as well as automatons that are internal to the network.
  3. Write a mini orchestrator that either is part of Relay, or interfaces with relay to deploy a small container application to multiple nodes based on the content address of the node; if necessary, it may spin up a NixOS image and bring it to the necessary environmental configuration to install the container.
  4. Determine how network communication inside the Matrix network occurs. When we want to implement automaton dependencies, we may want to specify exactly how two automatons communicate (this is selection of transport between two automatons).
  5. Produce a small example of all the main features we want to support:

    • Content Addressed Environments
    • An API that integrates with the container info collection on the Architect side to query a node for performance information
    • Programmatic deployment of containers and environment to specific nodes
    • Spinning up new nodes with NixOS
    • Automatically configure NixOS environment on demand for a container deploy
    • Data flow between peer nodes that is direct, not forwarded

  6. Integration with other subsystems so that we can automatically deploy a container and environment configuration through the API (so that the Architect interpreter can determine where to deploy such container based on the QOS constraints).

I'd like to suggest an alternate roadmap, because I think this one focuses on development before the necessary structural considerations are figured out. I've taken some points from the quoted roadmap as a base for what's below, though. Just as a heads up, I'd expect progress to be made on multiple points at once. For example, points 1 and 2 will probably be worked on at the same time and be influenced by point 4, but at least this list provides some insight into what I think should occur, in a loose order.

  1. Determine how network communication within a Matrix network occurs.
    • specifics of various p2p networks
    • level of decentralisation/distribution on the network
    • protocol and algorithm specifications and implementations
    • architecture of networks and implementation possibilities
    • security
  2. Determine methods of content addressing and service layer abstractions.
    • What kind of need there is for a new abstraction
    • Base requirements for this abstraction
    • implementation possibilities and consequences on the network
  3. Determine how network communications from outside a Matrix network occurs.
    • Protocols
    • Security
    • Methods of access and access control
    • Service discovery and usage
  4. Define an API for communication/usage with other Matrix subsystems
  5. Produce a sample of a set of features.

CMCDragonkai commented 6 years ago

I'm adding an overarching aim to resolve: to find out the mapping from the high-level Architect composition operators (functional, object, union... etc.) to the low-level network implementation.

There are many solutions on the market right now that address bits and pieces, but nothing that perfectly matches what we are looking for. We still need to examine them to derive some insights and build on top of their experiments.

We need to answer that question ASAP, and then proceed to build the bindings into them for integration into the language.
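
To frame that mapping with the earlier example, here is a purely hypothetical lowering of httpCombinator(AutomatonA, AutomatonB): the operator would have to produce both the environment binding that A sees and the flow that the network layer has to realise. Nothing here is a committed design; it only illustrates the kind of output the mapping needs to produce.

// Hypothetical lowering of a composition operator into concrete wiring.
function httpCombinator(a, b) {
  return {
    // What AutomatonA sees in its environment: BADDRESS bound to B's address.
    envBindings: { BADDRESS: b.contentAddress },
    // What Relay has to realise: an HTTP flow from A to B.
    flows: [{ from: a.contentAddress, to: b.contentAddress, transport: 'http' }],
  };
}

const wiring = httpCombinator(
  { contentAddress: 'ASD*(@#RJ#EOI$R)' },  // AutomatonA
  { contentAddress: 'E*FDS(&(#*$U*$))' }   // AutomatonB
);
console.log(wiring);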

ramwan commented 6 years ago

Update to roadmap after close to 2 months.

  1. Determine how network communication within a Matrix network occurs.
    • specifics of various p2p networks
    • level of decentralisation/distribution on the network
    • protocol and algorithm specifications and implementations
    • architecture of networks and implementation possibilities
    • security
  2. Determine methods of content addressing and service layer abstractions.
    • What kind of need there is for a new abstraction
    • Base requirements for this abstraction
    • implementation possibilities and consequences on the network
  3. Determine how network communications from outside a Matrix network occurs.
    • Protocols
    • Security
    • Methods of access and access control
    • Service discovery and usage
  4. Define an API for communication/usage with other Matrix subsystems
  5. Produce a sample of a set of features.

After doing an initial experiment based on point 2 (service addressing and migration possibilities), the importance of repeatedly building and testing designs and ideas is rather clear. The points above are still relevant, although the roadmap may not be as linear and the ordering may not be as first proposed.

The roadmap directory now contains documents on things to do and directions to take.

A notable file is the orchestrator design doc, which jots down thoughts on functionalities, layouts, data structures, etc.

CMCDragonkai commented 6 years ago

@ramwan This issue should be split up into issues and organised under the project boards now.