Open fheinecke opened 5 months ago
Thank you @fheinecke for cutting such a detailed issue. We have discussed Kata Containers in https://github.com/bottlerocket-os/bottlerocket/issues/812 as well. You provided a lot of data and it's taking a bit to work through it, so I wanted to let you know that I've seen this and am working on a response. I'll come back here with more details as I have them.
I wanted to provide a bit of an update from the discussions that have happened offline.
Would the Bottlerocket project be willing to accept a PR for (1) or (2)? I'm currently willing to put in most of the work here, but I'd like to know beforehand if there is some version of this that the project would accept.
I can rule out option 1. That is quite a bit of additional software to add to existing variants, and it has only limited use in much of EC2, since you need to be using bare metal instances for the virtualization to work. This is a core reason why we have variants: to allow users to choose between these types of use cases while keeping their images minimal. Kata Containers is enough of a departure from the other existing variants that it would warrant its own variant, just as NVIDIA use cases were enough of a departure to warrant their own set of variants.
Kata could add support for Bottlerocket in their install tooling.
As you called out in option 3, there are a lot of downsides and I think we agree that it is less than ideal.
We have been investing in tooling to make building and maintaining your own variant significantly easier, so I'd like to focus on Option 4 instead of 2. We recently broke out the package definitions into the bottlerocket-core-kit with a primary goal of enabling much better support for this option or options like it. This path isn't without its own work to figure out how the tooling enables you to build and maintain your own variants with these types of changes, but the Bottlerocket team is actually pretty excited about the possibility of working together to make a version of this option viable for everyone involved. I think there is a lot of merit in figuring out what might work with Option 4.
I'll work on collecting more thoughts about the technical steps needed, but as the first pass, we need to build in some ability to configure the SELinux contexts appropriately for Kata containers. A good starting point would be to create a fork of this repo as it exists now and start trying to prototype out this enablement by adding your own variant definitions and adding packages for Kata containers in the fork. This would enable reviews to happen on this code and guidance around any challenges you run into.
Hi folks,
I'd like to open an issue for adding Kata Containers to Bottlerocket OS. The Kata Containers project adds a new runtime to support running containers inside lightweight VMs, such as Firecracker VM or Cloud Hypervisor. The two projects are security-oriented and I think that they complement each other quite well. Kata Containers provides a level of isolation between "containers" that is not normally achievable with the typical namespace/cgroup approach that most CRIs take, and Bottlerocket OS's secure-by-default approach helps limit the impact of any vulnerabilities in the container runtime.
I have a working proof-of-concept of installing Kata Containers after the node has started, however, it requires spawning a pod with super_t context and has some other security risks. I've talked with the Kata team and we feel that the most secure solution would be to include the binaries in the OS image. However, it would be possible to add out of the box compatibility to the upstream Kata project if it cannot be added here, at the cost of being a less-secure solution.

There are several ways that Kata could be added to Bottlerocket:
Bottlerocket could include it in all images. This would make it really easy for users to get started with containers as VMs. All users would need to do is specify the desired runtime when starting a container, via docker run --runtime containerd.kata.v2 or a k8s runtime class.

The downside is that many users might not want to use kata containers. Including these packages would add about a gigabyte of disk space, and would add some additional processes that are running all the time. Part of this could potentially be mitigated by adding a toggle in the settings to enable or disable the runtime. The image would include everything needed to get started (binaries, config, selinux policies), but containerd would not be configured to start these processes until explicitly enabled.
super_t access.
/opt, which means that they could be overwritten with a malicious version.
super_t actually needs more permission than it already has so that it can relabel the Kata runtime binaries as runtime_exec_t. Due to a denyalways statement, this requires that selinux be temporarily disabled, globally, at runtime for processes with the super_t context.

Would the Bottlerocket project be willing to accept a PR for (1) or (2)? I'm currently willing to put in most of the work here, but I'd like to know beforehand if there is some version of this that the project would accept.
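To make the "k8s runtime class" route from the first option a bit more concrete, here is a rough sketch of what selecting the runtime might look like from the user's side. It only builds the two Kubernetes manifests as plain dictionaries and prints them as JSON. The handler name "kata", the runtime class name, and the pod and image names are all assumptions for illustration; the handler would have to match whatever runtime name containerd is actually configured with on the node.

```python
import json


def kata_runtime_class(name="kata", handler="kata"):
    """Build a RuntimeClass manifest as a plain dict.

    Both `name` and `handler` are hypothetical defaults; the handler must
    match the runtime name configured in containerd on the node.
    """
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def pod_using_runtime_class(pod_name, image, runtime_class_name="kata"):
    """Build a Pod manifest that opts in to the given runtime class."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            # This single field is what selects the alternate runtime.
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": pod_name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Placeholder pod and image names, for illustration only.
    print(json.dumps(kata_runtime_class(), indent=2))
    print(json.dumps(pod_using_runtime_class("kata-demo", "busybox"), indent=2))
```

Pods that do not set runtimeClassName would keep using the default runtime, so the opt-in is per workload rather than per node.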
What I'd like: Kata Containers deployed with Bottlerocket OS
Any alternatives you've considered: See discussion above