Containerization support for more deployment setups for the boefjes

Lisser commented 1 year ago

Copied from closed-source issue.

@Donnype said:

Intro

We have been talking a lot about containerized boefjes, but in practice there is no proper support yet. This ticket captures that we want to finally start properly supporting (a subset of) the thought-out use cases. To link the right people: @underdarknl @ammar92 @noamblitz @Lisser @dekkers please comment if this list is missing important use cases or contains ones that should not be there.

Context

Types of containers

There are several types of containerization that we have discussed in the past and could start implementing:

LXD
FireCracker (for normalizers specifically)
OCI (e.g. Docker) images

These have their own set of challenges:

LXD has been implemented partially, but poses challenges: there is no big community around it and it is hard to run locally (no support for Mac users, if my memory serves me right).
FireCracker is more of a micro-VM that does not have out-of-the-box support for building images. Having said that, it seems possible to build FireCracker images from Docker images by taking a built Docker image and mounting the filesystem in the FireCracker VM.
OCI can be interesting in terms of support and community. This would also serve most users, since many of them run KAT on Kubernetes. Docker builds OCI images, which makes life easier given the team's experience as well.

Types of container registries

Moreover, each kind of containerization requires supporting a repository of container images, both so the Katalogus can show information about them and so the workers can pull images from it:

LXD needs the custom plugin repository
FireCracker: not completely sure yet, but here are some references:
- https://stackoverflow.com/questions/53938944/firecracker-microvm-how-to-create-custom-firecracker-microvm-and-file-system-im
- https://github.com/firecracker-microvm/firecracker/blob/main/docs/rootfs-and-kernel-setup.md
OCI has specifications for registries such as Dockerhub or cloud container registries

The approach to supporting multiple registries was defined as deploying the Plugin Repository around each image type and using the same client everywhere:

Figure 1

This is easy when there is one kind of repository, and using our custom plugin repository is justified in the LXD case. However, when we are going to support OCI images as well, using a Plugin Repository becomes a lot of overhead since it is another component to deploy and we have to translate OCI images into our own artifacts.

Proposal

Priorities in containerization support

My proposal to tackle this is as follows. First and foremost, I believe that the priority of support should be driven by what the KAT users need the most. Even more important to note is that the most intensive KAT users at this stage are the KAT developers, so their setup should remain easy to work with. I think it would be an awesome feature if we could have the following flows in KAT:

Local devs: build kat and start running a worker/katalogus locally with only the LocalRepository. Copy-paste a boefje folder into a new boefje folder and change the definition file and logic to create a new boefje. Refresh the katalogus and your boefje is available to run.

Community using Kubernetes: deploy KAT using your own Helm charts in a Kubernetes/cloud environment. Use either the Katalogus interface in Rocky or the env-file to add your Cloud Boefjes-Container Registry (BCR) and all relevant configuration. You immediately see all boefjes/normalizers available from this BCR. You can build and push a new boefje locally or in a CI/CD pipeline (where you set all configuration parameters such as name, consumes and produces as labels, for instance), perhaps using some tooling we provide. Once pushed to the BCR, it is available in the Katalogus immediately after a refresh. (This is the idea I tried to explain to Reinoud @noamblitz.)
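As a rough illustration of the label idea above, the sketch below turns the labels of an image into a plugin definition. The label keys (`org.openkat.plugin.*`), the comma-separated value format and the `PluginDefinition` type are hypothetical and not an existing KAT convention; only the idea of carrying name, consumes and produces as labels comes from the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical label namespace; not an existing KAT convention.
LABEL_PREFIX = "org.openkat.plugin."


@dataclass
class PluginDefinition:
    image: str
    name: str
    consumes: list[str] = field(default_factory=list)
    produces: list[str] = field(default_factory=list)


def _split(value: str) -> list[str]:
    """Split a comma-separated label value, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def plugin_from_labels(image: str, labels: dict[str, str]) -> PluginDefinition:
    """Build a plugin definition from the labels of an OCI image."""
    name = labels.get(LABEL_PREFIX + "name")
    if not name:
        raise ValueError(f"{image} has no {LABEL_PREFIX}name label")
    return PluginDefinition(
        image=image,
        name=name,
        consumes=_split(labels.get(LABEL_PREFIX + "consumes", "")),
        produces=_split(labels.get(LABEL_PREFIX + "produces", "")),
    )
```

With something like this, the Katalogus would only need to read image labels from the registry to list a plugin, without pulling or running the image.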

If as a user I want to add some developer's boefjes that are hosted on dockerhub, I would not want to deploy a whole new instance of a plugin repository to enable it. Schematically this means moving towards the following model:

Figure 2

This would allow us to leverage common libraries/SDKs and registry APIs without the intermediate step of a plugin API client, and reduce the number of services to deploy. I think always adding this extra step only makes sense if we decide to first cast all containerization tooling into our custom image format and standardize how we run containers as well. In practice I think it would be way too much work to support e.g. OCI registries and images by casting them into our own model when this can be done quite easily through popular SDKs. If we still used these SDKs in the Plugin API wrapper as displayed in Figure 1, the wrapper would be as thin as a sheet but still carry a lot of complexity: deploying a whole extra service and supporting complicated containerization translations.

Therefore, I think that Figure 2 would be the way to go in the foreseeable future, and in that case the following priority makes the most sense to me:

We need to pick features out of these we do want to implement and decide when to implement them. The first three would mean full OCI support in a cloud environment using Kubernetes. The third one would also be the setup for a hybrid environment where only the boefjes/normalizers are run in a container, and would only need the last one to be implemented as well to support this. At the time of writing I am not sure how far we are from a functional LXD setup.

Important: implementing these features should not make kubernetes, LXD or FireCracker a requirement for local development in my opinion.

Overview of potentially different architectures

Assuming we want to start workers for just one type of containerization (input from @underdarknl), the schematic flow of control looks as follows:

Whether or not we want to use a private PyPI registry or custom FireCracker VMs wrapped in a Plugin Repository is open to discussion.

Note on extensibility Katalogus

I have not discussed the implications this has for the current repository models in the Katalogus and how we would be able to share Katalogi/Plugin Repositories between instances. I think, however, that this should not be too complicated: a Katalogus instance could, for instance, expose all public repositories through an API, and another Katalogus instance could copy all repositories from this API into its own database.

Conclusion

These thoughts are the result of several (past) discussions within the team, talking to some active users of KAT and a lot of contemplation about the architecture from my side. Let me know what you think, feedback is more than welcome!

Lisser commented 1 year ago

@ammar92 said:

Thanks for sharing this @Donnype! I do recognize many of the ideas and suggestions you made.

I would like to add some remarks about what we already have, particularly about LXD and the Plugin Repository. A few months ago I continued working on an LXD PoC by @errieman and extended this to an almost fully working LXD pipeline and Plugin Repository:

An LXC base image with an entry point/bootstrap script to run a plugin
Tooling to build an LXC image out of a plugin source
A basic LXD runner
Protocol between the runner and the entry point to bootstrap the plugin and pass the correct arguments
The Plugin Repository (pretty much) as it is right now
- A basic simplestreams implementation (for LXD) integrated into the plugin repository, which is pretty cool because we're one of the few that have this. This allows pulling and verifying images from the plugin repository using the standard LXC tools
A basic (YAML-based) manifest for the plugins based on the properties suggested by @underdarknl. I'm not sure when and why these are eventually transformed to JSON, since the manifest and its YAML encoding are equally adequate.

By the way, LXD doesn't strictly need a (custom) plugin repository; the plugin repository is merely an abstract registry that could hold images and code for a variety of image types (e.g. LXD, Docker, and even source code, Python packages or binaries).

I very much like your overview of the potentially different architectures. It seems like the correct way to go with the support for different runtimes as workers.

The only thing that still bothers me a lot is the current way we have the 'local' plugins. Sooner or later the current implementation will reach its limit due to e.g.:

Errors while resolving transitive dependencies
Not isolated at all, plugins share the same packages and environment
Idempotency might be an issue, e.g. running it from a different platform could result in different packages/binaries and eventually in different results

Some isolation might be achieved using e.g.:

Python eggs (this is of course a joke)
Virtual envs managed by the runner

Of course, this is all temporary, since the end goal eventually is to containerize the plugins, both local and remote plugins.

Lisser commented 1 year ago

@Donnype said:

I realised that one assumption in my overview is that we want to support OCI images, but I think this will be the easiest way to facilitate a plethora of community boefjes due to its popularity. It would also make it possible to extend to other languages thanks to the standardised entrypoint specification (whereas using LXD for other languages right now would mean that we have to support different build tools, add entrypoint support, etc. ourselves).

@ammar92 I can indeed see the issues there, but I see the local setup mostly as a non-production way to get started with developing KAT. I think the first two issues can be fixed with venvs, by either using your TemporaryEnvironment setup and updating the virtual environment, and/or running boefjes in a subprocess. Then for the third I think we could create tooling that builds OCI images from the boefjes/normalizers, after which you can add your local Docker images as a registry?
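The venv-plus-subprocess idea could look roughly like the sketch below, which uses only the standard library. The function names and the one-environment-per-plugin layout are assumptions made for illustration; this is not how the current local runner or TemporaryEnvironment works.

```python
from __future__ import annotations

import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def ensure_venv(env_dir: Path, requirements: Path | None = None) -> Path:
    """Create the environment if it is missing and install the plugin's requirements."""
    python = venv_python(env_dir)
    if not python.exists():
        venv.create(env_dir, with_pip=True)
        if requirements is not None:
            subprocess.run(
                [str(python), "-m", "pip", "install", "-r", str(requirements)],
                check=True,
            )
    return python


def run_plugin(env_dir: Path, script: Path, *args: str, requirements: Path | None = None) -> bytes:
    """Run a plugin script with its own interpreter, in a separate process."""
    python = ensure_venv(env_dir, requirements)
    result = subprocess.run([str(python), str(script), *args], capture_output=True, check=True)
    return result.stdout
```

This addresses shared packages and dependency conflicts between plugins, but not platform differences, which is where building images would come in.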

Lisser commented 1 year ago

@dekkers said:

I agree that we should start with how users want to deploy OpenKAT. As far as I know, there are three ways in which people are running OpenKAT, or have told us they want to run it, that we also want to support:

On Debian and Ubuntu with our Debian packages
On Nomad with our container images
On Kubernetes with our container images

With Nomad and Kubernetes you will want to run your boefjes in the cluster and OCI images/registries work fine for that. They both also support using microVMs with Kata containers (and Kata supports both QEMU and Firecracker). And with Kubernetes you could also use gVisor.

For Debian/Ubuntu we can just support docker and/or podman to run the boefjes containers (Or maybe talk to containerd directly? Not sure about that, would need to investigate). If you want to use microVMs you can also configure containerd to use Kata as far as I can see. This will all work with OCI images and registries too.

The fact that those are open standards in wide use gives a lot of advantages:

There are a lot of different registry implementations. We don't have to reinvent the wheel here. People can also put their boefjes on any registry they already have: Docker Hub, GitHub, GitLab, etc. The GitHub/GitLab container registries also provide a nice way of pushing them in CI.
Tools for proxying/whitelisting images can be reused.
A lot of tooling to build images already exists.
There is also tooling for security checking such as trivy.

Another benefit, as Donny mentioned, is that if we define the boefjes/normalizer interface as an OCI image/container (e.g. how the container is started, what input it gets and what it should output), it would also be possible to implement boefjes in another programming language, and such a boefje would run on every KAT installation.
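One way to picture such an interface is a runner that only knows how to start an image and hand it its input. The sketch below builds a `docker run` command; passing the input as a JSON string in a `BOEFJE_INPUT` environment variable and reading the raw output from stdout are assumptions made for illustration, not a decided KAT contract.

```python
import json
import subprocess


def build_run_command(image: str, boefje_input: dict, runtime: str = "docker") -> list:
    """Return the argv that starts a boefje container with its input.

    BOEFJE_INPUT is a hypothetical variable name used for illustration.
    """
    payload = json.dumps(boefje_input, sort_keys=True)
    return [runtime, "run", "--rm", "--env", f"BOEFJE_INPUT={payload}", image]


def run_boefje(image: str, boefje_input: dict, runtime: str = "docker") -> bytes:
    """Run the container and return whatever it wrote to stdout."""
    command = build_run_command(image, boefje_input, runtime)
    result = subprocess.run(command, capture_output=True, check=True)
    return result.stdout
```

Because the runner never looks inside the image, the boefje itself could be written in any language, and `runtime` could be swapped for `podman` without changing the contract.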

Given that Firecracker can be used via Kata, I don't think we need Firecracker-specific images or a Firecracker-specific runner, unless I am missing some big advantage that would result from using Firecracker directly.

For local development you can just build a new Docker image with docker build or whatever tool you want to use to build your image. So I don't really understand why we would need something like LocalRunner / virtualenvs. What am I missing here?

Then there is only one last thing that doesn't support OCI and that is LXD. But to be honest I don't really understand the reasons why we would want to support LXD, because the more I look at it, the more I don't like it. First of all, LXD ignores all the open standards that exist for containers and only supports its own things.

But to my surprise it also doesn't seem to be completely open source. I was looking at what kind of API they have and how to do authorisation (so that the boefje running cannot do more than start/stop boefjes containers), but apparently if you want RBAC you need the proprietary Canonical RBAC service that is only available with an Ubuntu Advantage subscription: https://discuss.linuxcontainers.org/t/security-questions/8946/2 So apparently if you want to create a secure setup you need to get proprietary stuff from Canonical, so LXD seems to be more open core than open source.

I don't really see any advantage LXD brings over the other deployment options, and I don't know of any potential user of OpenKAT saying they want to use LXD, but it is very clear that it will be a lot of work to support it because they refuse to implement the open standards that exist.

So my proposal would be to only support OCI images and registries.

For the katalogus the boefjes and normalizers are then mostly OCI image names. It is possible to add KAT-specific metadata to images; for example, a normalizer could have a field that lists the MIME types it supports. Docker and others implement a catalog endpoint, but not everyone does it the same way, so it is unfortunately not standardized: https://github.com/opencontainers/distribution-spec/pull/45#issuecomment-521425185 So such an interface to fetch all boefjes won't always be available.

A simple solution for discovery might be that we just implement a simple JSON file/endpoint that lists all boefjes/normalizer containers. For all public boefjes we could put a generic list on GitHub, using PRs to update it and Actions to publish it on Pages. Anyone who wants to run private boefjes from a private container registry could either create such a JSON file somewhere or just configure every individual boefje in the katalogus with the private container registry URL.
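A minimal sketch of what reading such a list could look like. The file layout (a top-level `plugins` array with `type`, `name` and `image` fields) and the example image references are assumptions for illustration; no format has been agreed on in this thread.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    type: str   # "boefje" or "normalizer"
    name: str
    image: str  # full OCI image reference, including the registry


def parse_index(document: str) -> list[IndexEntry]:
    """Parse a JSON plugin index into entries, skipping unknown plugin types."""
    data = json.loads(document)
    entries = []
    for item in data.get("plugins", []):
        if item.get("type") not in ("boefje", "normalizer"):
            continue
        entries.append(IndexEntry(item["type"], item["name"], item["image"]))
    return entries


# Hypothetical index document, as it might be published on a static page.
EXAMPLE_INDEX = """
{"plugins": [
  {"type": "boefje", "name": "dns-records", "image": "registry.example/boefjes/dns-records:1.0"},
  {"type": "normalizer", "name": "dns-normalize", "image": "registry.example/normalizers/dns:1.0"}
]}
"""
```

Calling `parse_index(EXAMPLE_INDEX)` yields one boefje and one normalizer entry; a private registry could publish the same kind of document at its own URL.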

minvws / nl-kat-coordination