bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

Add Kata Containers to images #4070

Open fheinecke opened 5 days ago

fheinecke commented 5 days ago

Hi folks,

I'd like to open an issue for adding Kata Containers to Bottlerocket OS. The Kata Containers project adds a new runtime that supports running containers inside lightweight VMs, such as Firecracker or Cloud Hypervisor. Both projects are security-oriented, and I think they complement each other quite well. Kata Containers provides a level of isolation between "containers" that is not normally achievable with the typical namespace/cgroup approach that most container runtimes take, and Bottlerocket OS's secure-by-default approach helps limit the impact of any vulnerabilities in the container runtime.

I have a working proof-of-concept that installs Kata Containers after the node has started. However, it requires spawning a pod with the super_t context and carries some other security risks. I've talked with the Kata team, and we feel that the most secure solution would be to include the binaries in the OS image. If it cannot be added here, out-of-the-box compatibility could instead be added to the upstream Kata project, at the cost of being a less secure solution.

There are several ways that Kata could be added to Bottlerocket:

  1. Bottlerocket could include it in all images. This would make it really easy for users to get started with containers as VMs. All users would need to do is specify the desired runtime when starting a container, via docker run --runtime io.containerd.kata.v2 or a k8s runtime class.

    The downside is that many users might not want to use Kata Containers. Including these packages would add about a gigabyte of disk usage, as well as some additional processes that run all the time. Part of this could potentially be mitigated by adding a toggle in the settings to enable or disable the runtime. The image would include everything needed to get started (binaries, config, SELinux policies), but containerd would not be configured to start these processes until explicitly enabled.

  2. Bottlerocket could create a new variant (or variants) with this package. This would be more to maintain, but would exclude the package from the "normal" variants so that it's not included in every image. I believe that this would add 15 more images if a Kata variant were added for all current variants that support k8s.
  3. Kata could add support for Bottlerocket in their install tooling. This would require little to no change on Bottlerocket's end. The downside is that this is objectively less secure than including it in a Bottlerocket image. Here's specifically where some of the security issues lie:
    • Installation requires super_t access.
    • Kata binaries are included under /opt, which means that they could be overwritten with a malicious version.
    • The super_t context actually needs more permissions than it already has so that it can relabel the Kata runtime binaries as runtime_exec_t. Due to a denyalways statement, this requires that SELinux be temporarily disabled, globally, at runtime for processes with the super_t context.
    • If installing on k8s via a DaemonSet (as is the standard process for Kata on k8s), there will be a long-running pod with these permissions and several host mounts.
  4. The company I work for might be willing to maintain a variant as described in (2) for as long as we use both Bottlerocket and Kata. The downside is that if we stop using either of these projects at any point, we would probably also stop supporting this variant. Additionally, anybody who wanted to use these images would need to trust us as much as they trust the Bottlerocket project.
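
To make the "k8s runtime class" route from option (1) concrete, here is a minimal Python sketch that builds a RuntimeClass and a pod that opts into it, then prints them as JSON. The handler name `kata`, the pod name, and the image are assumptions chosen for illustration; the handler has to match whatever name the Kata runtime is registered under in the node's containerd configuration, which this issue does not specify.

```python
import json

# Assumption: the Kata runtime is registered in containerd under the handler
# name "kata". The real name depends on how the node is configured.
KATA_HANDLER = "kata"


def runtime_class(name: str, handler: str) -> dict:
    """Build a Kubernetes RuntimeClass manifest (API group node.k8s.io/v1)."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def pod_using_runtime(pod_name: str, image: str, runtime_class_name: str) -> dict:
    """Build a Pod manifest that selects a runtime via spec.runtimeClassName."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": pod_name, "image": image}],
        },
    }


def as_list(items: list) -> dict:
    """Wrap several manifests in a v1 List so they form one JSON document."""
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # Hypothetical names and image, used only for illustration.
    manifests = as_list([
        runtime_class("kata", KATA_HANDLER),
        pod_using_runtime("kata-demo", "busybox", "kata"),
    ])
    print(json.dumps(manifests, indent=2))
```

The output could be piped to `kubectl apply -f -`. Pods that do not set `runtimeClassName` would keep using the node's default runtime.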

Would the Bottlerocket project be willing to accept a PR for (1) or (2)? I'm currently willing to put in most of the work here, but I'd like to know beforehand if there is some version of this that the project would accept.

What I'd like: Kata Containers deployed with Bottlerocket OS

Any alternatives you've considered: See discussion above

yeazelm commented 3 days ago

Thank you @fheinecke for cutting such a detailed issue. We have discussed Kata Containers in https://github.com/bottlerocket-os/bottlerocket/issues/812 as well. You provided a lot of data and it's taking a bit to work through it, so I wanted to let you know I've seen this and I'm working on a response. I'll come back here with more details as I have them.