Integration into Container Runtime (runc)

CMCDragonkai commented 6 years ago

At some point we need to integrate our network fabric into the Emergence container runtime (which is currently planned to be runc).

It's important to understand that runc is a daemonless program. Unlike the Docker container runtime, it does not use a client-server architecture with a daemon. Being daemonless makes it more portable and flexible, since it is easier to integrate with other container orchestrators. This is probably one of the reasons why runc is popular among different container orchestrator implementations. However, being daemonless means that most interactions are best done in a stateless manner, while container runtimes involve a lot of state: state that is required to track which containers are currently running and to manipulate them. runc manages to preserve this state by making use of the filesystem. In particular, we are concerned with the initialisation stages.

runc demonstrates that there are 2 stages in the initialisation of containers.

runc create
runc start

The runc create command will fork and exec a separate "container" process. This is not the entrypoint program inside the container, but simply the private command runc init. Note that runc init is similar to private executables in the GNU libexec style. "Library executables" are a programming pattern intended to encapsulate functionality across process boundaries, usually because of some operating system state that cannot be cleanly separated within a single process or between threads of a single process. This process is the "container" process. It will be the process that has all the container isolation primitives applied, but it won't yet execute the container entrypoint. It basically sits idle, waiting on a signal via a filesystem pipe. At this point the runc create process terminates.

The runc start command then synchronises with the "container" process via the filesystem pipe and waits for an answer. The "container" process then begins its second-stage initialisation before finally executing the container entrypoint.

The relevant steps for runc create are:

create.go
utils_linux.go
- utils_linux.go loads a factory from libcontainer/factory.go and libcontainer/factory_linux.go
- using CT_ACT_CREATE it will start the "container" process via libcontainer/container.go and libcontainer/container_linux.go
- libcontainer/container_linux.go calls libcontainer/process.go and libcontainer/process_linux.go to spawn runc init

The relevant steps for runc init are:

init.go
- init.go calls into libcontainer/factory.go and libcontainer/factory_linux.go
- libcontainer/factory_linux.go eventually calls init_linux.go and standard_init_linux.go (these embody some modularity design decisions that I don't entirely understand)
- the Init method in standard_init_linux.go is where all the isolation primitives are set up for the "container" process

The factory is basically something that creates containers. There's only the Linux implementation.

A runner is something that wraps the container object and provides methods to run or destroy or terminate the container.

Note the usage of domain modeling in the code. Unlike Haskell, Go has no type for IO, so you don't know which functions perform side effects and which are just pure expressions that manipulate the domain models. However, there is a hint: if you see a function returning error, it is possible that the function is side-effectful.

Furthermore, be careful with the constructor functions. Sometimes the code uses struct initialisation via &name{}, sometimes it uses constructor methods, and other times it uses factory functions. There are lots of different idioms being used in the runc codebase.

The fact that there is this stage separation between runc create and runc start means that the phase in between may be where we can integrate the Relay fabric into the intra-host and inter-host connections between our Automatons.

Remember to use sourcegraph to view the Go code.

Here are some other notes which may be of use:

runc create:
  * calls startContainer with action set to `CT_ACT_CREATE` in utils_linux.go
  * startContainer calls createContainer, uses a factory to Create
  * initialises a runner struct, and calls the run method of the runner struct
  * since it is CT_ACT_CREATE, it runs r.container.Start(process)
  * the Start method is implemented in libcontainer/container_linux.go (as an implementation of the generic base container interface)
  * Start is a public method wrapping a private method start
  * private method start calls private method newParentProcess
  * newParentProcess method will call private method newInitProcess
  * control returns to private method start, which eventually calls process.start; this start is in process_linux.go
  * this ends up calling runc init which is a private command
  * runc init will call into factory_linux.go StartInitialization
  * and this uses a variety of methods and structures provided by init_linux.go and standard_init_linux.go
  * the call graph for runc init is at the very bottom of the diagram at https://github.com/MatrixAI/Emergence/blob/master/container/libcontainer-callgraph-nointer.svg showing (*LinuxFactory).StartInitialization calling into linuxStandardInit

CMCDragonkai commented 6 years ago

Please assign to the next project.

nzhang-zh commented 5 years ago

Network namespace setup with runc

A path to a named namespace file can be specified in config.json. The container will then run in this namespace. ip netns creates named network namespace files under /var/run/netns/. We can potentially create a network namespace for every automaton instance and point runc to it.

Alternatively, we could let runc create a new network namespace by omitting the path. The namespace file can then be accessed at /proc/<pid>/ns/net. A symlink into /var/run/netns is then required for iproute.
