aurae-runtime / aurae

Distributed systems runtime daemon written in Rust.
https://aurae.io
Apache License 2.0

Create "Container Service" #435

Open krisnova opened 1 year ago

krisnova commented 1 year ago

This will sound a bit confusing at first, but just go with me on this.

We need a new "top level" service that sits alongside the existing "Cell Service".

This should be an extremely lightweight container service that runs containers without the concept of CRI or a Pod Sandbox. The new "container service" will be called from a host auraed to schedule a container in a new pod sandbox.

Example

Here is an example "model" to help me communicate what I am thinking.

Imagine a new host called "alice" which is effectively a bare metal server running in a rack. Alice will run a Linux operating system with auraed as pid 1 and we will call this initial instance of auraed the Host Auraed.

A user remotely schedules 2 pods on Alice (auraed-guest-1 and auraed-guest-2), each with its own nested instance of auraed.

The Host Auraed first creates the guest for each pod, and then calls out to the container service being suggested in this GitHub issue to schedule containers on each guest. The containers in each guest are free to communicate with each other using the guest filesystem, and the guest namespaces where applicable.

alice/
├── auraed                       # Host Auraed
├── auraed-guest-1               # Guest Pod 1
│   ├── auraed                   # Guest Auraed 1
│   └── container-service.rpc    # New unwritten RPC service
├── auraed-guest-2               # Guest Pod 2
│   ├── auraed                   # Guest Auraed 2
│   └── container-service.rpc    # New unwritten RPC service
├── cell-service.rpc             # Host cell service; sits alongside other RPCs, health, etc
└── cri.rpc                      # Host CRI where the user calls "RunPodSandbox"
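
To make the flow above a bit more concrete, here is a rough Rust sketch of the model: the host creates a guest per pod, then calls a container service on that guest. Everything in it is hypothetical and invented for illustration (`ContainerService`, `GuestAuraed`, `HostAuraed`, `run_pod_sandbox`, `schedule`, and the image name `example/app:latest`); none of it is an existing Aurae type or RPC, and it only tracks state in memory instead of starting real processes.

```rust
use std::collections::BTreeMap;

/// Hypothetical container description; not an Aurae type.
#[derive(Debug, Clone, PartialEq)]
struct Container {
    name: String,
    image: String,
}

/// Hypothetical stand-in for the proposed (unwritten) container service RPC.
trait ContainerService {
    fn run(&mut self, container: Container) -> Result<(), String>;
    fn list(&self) -> Vec<String>;
}

/// A guest pod: a nested auraed that exposes the container service.
#[derive(Debug, Default)]
struct GuestAuraed {
    containers: BTreeMap<String, Container>,
}

impl ContainerService for GuestAuraed {
    /// Records the container in memory; rejects duplicate names.
    fn run(&mut self, container: Container) -> Result<(), String> {
        if self.containers.contains_key(&container.name) {
            return Err(format!("container '{}' already exists", container.name));
        }
        self.containers.insert(container.name.clone(), container);
        Ok(())
    }

    /// Returns the names of all containers recorded in this guest.
    fn list(&self) -> Vec<String> {
        self.containers.keys().cloned().collect()
    }
}

/// The host auraed (pid 1 on "alice"), which owns the guests.
#[derive(Debug, Default)]
struct HostAuraed {
    guests: BTreeMap<String, GuestAuraed>,
}

impl HostAuraed {
    /// Models the "RunPodSandbox" step: create the guest for a pod.
    fn run_pod_sandbox(&mut self, guest: &str) -> Result<(), String> {
        if self.guests.contains_key(guest) {
            return Err(format!("guest '{}' already exists", guest));
        }
        self.guests.insert(guest.to_string(), GuestAuraed::default());
        Ok(())
    }

    /// Calls out to the guest's container service to schedule a container.
    fn schedule(&mut self, guest: &str, container: Container) -> Result<(), String> {
        let target = self
            .guests
            .get_mut(guest)
            .ok_or_else(|| format!("no such guest '{}'", guest))?;
        target.run(container)
    }
}

/// Builds the "alice" example: two guests, one container scheduled in each.
fn build_alice() -> Result<HostAuraed, String> {
    let mut alice = HostAuraed::default();
    for guest in ["auraed-guest-1", "auraed-guest-2"] {
        alice.run_pod_sandbox(guest)?;
        alice.schedule(
            guest,
            Container {
                name: "app".to_string(),
                image: "example/app:latest".to_string(), // hypothetical image
            },
        )?;
    }
    Ok(alice)
}
```

Calling `build_alice()` gives a host with two guests, each holding one container. Scheduling onto a guest that was never created returns an error, which matches the ordering in the description: the guest is created first, then the container service is called on it.
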
krisnova commented 1 year ago

Related to #433

bpmooch commented 2 weeks ago

@dmah42 assign this to me? it is very important for my use case

dmah42 commented 2 weeks ago

i never quite got my head around how this sits alongside the CRI or Pod sandbox concept, but if we're moving to a place where a guest can run "bare-metal" Cells, or VMs (cloud-hypervisor), or Containers (this service) then i'm happy with that as a direction.

mccormickt commented 2 weeks ago

It would be nice to (eventually) have CRI support to allow for something like an auraed-managed kubelet spawning pods in a nested cell or in a VM (using aurae as its container runtime). I'm not sure, though, that this is something we'd want to prioritize for the short-term goals of the project.

> if we're moving to a place where a guest can run "bare-metal" Cells, or VMs (cloud-hypervisor), or Containers (this service) then i'm happy with that as a direction.

To this point, I think having support for pulling/running container images normally would be a great start (similar to our approach with the VM service). Once we have that functionality, we can iron out how/when we'd want to implement CRI, whether with cells or VMs (or both!) as the Pod boundary.