edannenberg / kubler

A generic, extendable build orchestrator.
BSD 2-Clause "Simplified" License

Kubler 1.0.0 #208

Open edannenberg opened 2 years ago

edannenberg commented 2 years ago

As you probably all noticed, Kubler has been in maintenance mode for quite a while now. In part that's because there was a lot going on in my life these past couple of years (medical issues, changed jobs twice in the last year), but also because developing a fairly complex project like Kubler in Bash is often a battle that is not exactly fun, even though Bash does have advantages in some areas.

Developing a project in your spare time should be fun, and the language I do enjoy working with is traditionally not a good fit for Kubler, as Clojure runs on the JVM, which is a bit of a mismatch for lightweight shell applications. The good news is that the Clojure community has a lot of bright minds, and one of them recently came up with Babashka, which suddenly makes Clojure an excellent choice for Kubler.

I did some prototyping and the PoC looks promising so far; I'll most likely go ahead with the switch unless some deal breaker pops up unexpectedly. I hope to keep backwards compatibility for most things, though there will probably be a change to the config file format.

It's still very early but I feel this is a good point to gather some feedback. Are there any features or improvements you would like to see? Any pain points? Thanks!

r7l commented 2 years ago

First of all, thanks a lot for all the effort and time you've put into this project, even with the personal troubles and tasks you've had on your own. I hope your medical issues are better now and you can leave this chapter behind.

When it comes to the Bash language, I can only support the idea to move on. Bash is a good and stable language for a lot of tasks and scripts. But once a project scales up, it easily reaches a point of complexity that most other languages are better suited for, and the Babashka page actually sums it up quite well with Bash having a lot of grey areas most people don't know about.

So far, I've never worked with Clojure or any other JVM-based language, so I am not really able to judge the outcome. But I've never heard of Babashka before and I don't see it packaged for Gentoo yet. The other issue I see with this language is that the (rather limited, I'd guess) number of users might stop people from contributing to the project. Not that the number of users and contributors has been huge in recent months / years anyway. Well, it's your project, you're the main developer, and I can truly understand the reasoning behind picking a language you feel good with. So the language choice is surely up to you.

There is another thing that comes to my mind: when it comes to resources, I would consider Bash to be pretty much "for free", as it just runs commands and doesn't eat much on its own. But when thinking about a JVM (or similar), isn't it costing quite a bit of resources on its own? So far, whenever I've run JVM-related software, it would eat RAM like cake. I am currently running Kubler on 2 dedicated machines (amd64 and aarch64) for building packages. Both come with 4 GB of RAM, which works well. But once a needed JVM eats up 1 GB on its own, it might start to get complicated when compiling larger packages like GCC.

Other than that, I am pretty happy with the current state of Kubler as it is. The tasks I expect it to do are done perfectly. So thanks again for making Kubler such a great tool.

edannenberg commented 2 years ago

Thank you for the kind words @r7l.

But I've never heard of Babashka before and I don't see it packaged for Gentoo yet.

Babashka is a fairly small (22 MB currently) self-contained binary; installing it comes down to pasting a one-liner into your shell. I will also create an ebuild for it when the time comes.
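
For reference, the install one-liner from the Babashka README currently looks like this (it fetches and runs the official installer script):

% bash < <(curl -s https://raw.githubusercontent.com/babashka/babashka/master/install)
% bb --version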

The other issue I see with this language is that the (rather limited, I'd guess) number of users might stop people from contributing to the project.

That's a fair point. I think the barrier to entry for Clojure, especially with Babashka, is very low. There will be plenty of code to look at and learn from, so I feel it will be fine for people to provide fixes and additions if they want to. It might even be easier, as the code base will be much simpler without all the arcane Bash rituals. I have worked with pretty much all of the mainstream languages in the past 20 years, and Clojure is just a delight to work with.

There is another thing that comes to my mind: when it comes to resources, I would consider Bash to be pretty much "for free", as it just runs commands and doesn't eat much on its own. But when thinking about a JVM (or similar), isn't it costing quite a bit of resources on its own?

Babashka runs completely independently of the JVM; the performance impact should be minimal, if any.

r7l commented 2 years ago

I guess I'll have to wait and see; I will most likely be around to test the upcoming version. An ebuild for the language would be awesome.

r7l commented 2 years ago

It's been some months since my last comment and I've had a few thoughts about the new version here and there. Those are not features needed right now, just stuff that might be handy for a new release. Not sure if you're still working on rebuilding Kubler with a different code base or hit some deal breakers.

build.sh files

What's the plan with these files? Due to the nature of Kubler currently being written in Bash, those files are Bash-based. That is actually quite handy, and I have more than a few of them with additional Bash code. You have this in your files as well. Do you plan to keep them Bash-based? Like with ebuilds being Bash-based while Portage is written in Python. Not being able to add code in those files might be limiting.

Additional architecture support

Kubler currently supports the same set of architectures as Gentoo or Docker out of the box. This is great and works well. But using a Kubler image on multiple architectures at once comes with a few rough corners, most notably the lack of proper keywording in some of the ebuilds. For example: s6 comes with stable keywords for amd64 and x86, while being considered unstable on arm and missing arm64 completely. The package in question works on all of these.

Gentoo usually handles per-architecture keywording on its own, except when there is no keyword at all. That's when you have to use category/package **, which also unlocks every other version on every architecture; even worse, it unlocks live ebuilds. There are ways around this, and it's not a real blocker in any way: keywording specific versions (which then requires manual editing on each update), or just adding some Bash code for the architecture and such.
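
For illustration, the two workarounds look roughly like this in package.accept_keywords (the version number is just a placeholder):

# /etc/portage/package.accept_keywords
# unlocks every version on every arch, live ebuilds included:
sys-apps/s6 **
# safer: only this pinned version, but needs a manual edit on each bump:
=sys-apps/s6-2.11.3.2 **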

I am not sure if this should be addressed in Flaggie instead, given it's really a Kubler-only issue. In a real hardware environment, you would simply have different keywords on different hardware; they wouldn't collide. But with Kubler, you're able to reuse the images and build them for multiple architectures. So it might be a nice feature for the keywording or USE flag functions to support architectures, like this: update_keywords 'sys-apps/s6' '+**' 'arm64', which would then only apply the keywording on the arm64 architecture.

Technically it should be addressed in the ebuilds. But I guess getting proper keywords into all the ebuilds in question is a long way off.

Skipping images based on architecture

Another thing might be a flag / option to limit an image to a set of architectures (or just one) and skip the image on any other architecture.

Config changes

We've had this topic at some point and you introduced sed_or_die, which is a really great thing on its own. But even so, this still misses config changes: like when you didn't change anything but the developers or maintainers changed defaults or added new config options. On a real system, you would notice this right after building thanks to Portage. I wonder if we could get similar functionality with an option where you can define one or more files from the final build ROOT.

The files in question would be copied out and stored somewhere in the package directory (like with PACKAGES.md), and later on compared against the new files of any new build, acting similar to what Portage does. Like when you have nginx.conf and it changed: you may just drop a diff into a file named .nginx.conf.diff and, once the entire build is finished, Kubler shows a summary of all the changed files, or just a note plus a standalone summary option. Interactive editing like etc-update or similar tools isn't an option.

edannenberg commented 2 years ago

Not sure if you're still working on rebuilding Kubler with a different code base or hit some deal breakers.

Not much time lately, but it's still on my backlog.

build.sh files

What's the plan with these files? Due to the nature of Kubler currently being written in Bash, those files are Bash-based. That is actually quite handy, and I have more than a few of them with additional Bash code. You have this in your files as well. Do you plan to keep them Bash-based? Like with ebuilds being Bash-based while Portage is written in Python. Not being able to add code in those files might be limiting.

These will stay as shell scripts; the plan is to keep everything as backwards compatible as possible.
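
For context, a current-format build.sh is just plain Bash that Kubler sources, with a few well-known hooks, roughly along these lines (a minimal sketch; exact hook and helper names as I remember them from the current docs):

# build.sh
_packages="sys-apps/s6"

configure_bob() {
    # arbitrary Bash may run here, inside the build container
    update_keywords 'sys-apps/s6' '+~amd64'
}

finish_rootfs_build() {
    # post-process the final image root under ${_EMERGE_ROOT},
    # e.g. with sed_or_die or plain cp/rm
    :
}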

Additional architecture support

I guess some generic helper function that only does stuff on certain architectures should be doable.
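
Purely as a sketch of what that could look like in today's build.sh format (the _ARCH variable and the wrapper name are made up for illustration):

# hypothetical wrapper, not current Kubler API
update_keywords_for_arch() {
    local package="$1" keywords="$2" arch="$3"
    # only act when building for the given architecture
    [[ "${_ARCH}" == "${arch}" ]] || return 0
    update_keywords "${package}" "${keywords}"
}

# applied on arm64 builds only:
update_keywords_for_arch 'sys-apps/s6' '+**' 'arm64'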

Skipping images based on architecture

Another thing might be a flag / option to limit an image to a set of architectures (or just one) and skip the image on any other architecture.

There already is a --exclude arg for the build command that can be utilized for this.
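
Usage would be something along these lines (names are placeholders):

% kubler build my-namespace --exclude my-namespace/arm64-only-image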

Config changes

We've had this topic at some point and you introduced sed_or_die, which is a really great thing on its own. But even so, this still misses config changes: like when you didn't change anything but the developers or maintainers changed defaults or added new config options. On a real system, you would notice this right after building thanks to Portage. I wonder if we could get similar functionality with an option where you can define one or more files from the final build ROOT.

The files in question would be copied out and stored somewhere in the package directory (like with PACKAGES.md), and later on compared against the new files of any new build, acting similar to what Portage does. Like when you have nginx.conf and it changed: you may just drop a diff into a file named .nginx.conf.diff and, once the entire build is finished, Kubler shows a summary of all the changed files, or just a note plus a standalone summary option. Interactive editing like etc-update or similar tools isn't an option.

Yea, it would have to be for manual review, but I think it's a useful feature that should be straightforward to implement. 👍
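
A rough sketch of how that could look in the current Bash code base (all names hypothetical: _tracked_files as a new build.sh variable, _PACKAGE_DIR / _EMERGE_ROOT standing in for the package dir and the build root):

# in build.sh: files to watch in the final build root
_tracked_files="/etc/nginx/nginx.conf"

track_config_changes() {
    local file stored current
    mkdir -p "${_PACKAGE_DIR}/config"
    for file in ${_tracked_files}; do
        stored="${_PACKAGE_DIR}/config/$(basename "${file}")"
        current="${_EMERGE_ROOT}${file}"
        if [[ -f "${stored}" ]] && ! cmp -s "${stored}" "${current}"; then
            # keep a diff for manual review, i.e. .nginx.conf.diff
            diff -u "${stored}" "${current}" \
                > "${_PACKAGE_DIR}/config/.$(basename "${file}").diff"
        fi
        # remember the current version for the next build's comparison
        cp "${current}" "${stored}"
    done
}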

CryptoPunk commented 1 year ago

I tend to agree with many of the fears outlined above, mainly the incompatibility with existing Portage infrastructure (Bash- and Python-based) and the lack of mainline Portage tree support for Babashka. I've never worked with it, but why not something like Hy? It's a dialect of Lisp with a Clojure-like syntax that's cross-compatible with Python, allowing you to work freely with the Portage Python libraries. It seems like the perfect language if you want to move in that direction.

berney commented 1 year ago

I've been tinkering on #215 and on using kubler in a GitHub workflow.

I've now just started looking at docker buildx bake and it looks like it can do a lot of what we want, like the image (target) hierarchy, and will cache and parallelize a lot of things. It's currently an experimental feature.

This uses ARGs in Dockerfiles (#228) and a docker-bake.hcl.

It's also possible to define custom dockerfile syntaxes with a custom frontend (as an image) that will parse it and generate the LLB for buildkit. I don't think we'll need this, but it could be possible to add some syntax sugar, rather than a full custom syntax.

Leveraging these tools/features could reduce the scope of what kubler (or the nextgen rewrite in another language) needs to implement on its own.

Using some of these features, it might be possible to ship kubler as a Docker image, like a custom Dockerfile syntax (BuildKit frontend), so that an end user's repo could be just the Dockerfile and bake files; kubler would be an image, the user runs the docker buildx bake command, and kubler does its magic. This way the user just needs Docker installed, and distribution is all git (for the end user's repo) and a Docker registry (for kubler), so the OS (Windows), Bash version, Babashka etc. don't matter, as it's all containerised.

berney commented 1 year ago

Here is an experiment I did with using docker buildx bake on bob-portage: https://github.com/berney/kubler/tree/f-experiment-buildx-bake/engine/docker/bob-portage.

The docker-bake.hcl has comments with usage examples.

I decomposed the original kubler Dockerfile into two files. The first just gets the vanilla portage, and the second uses the first as the base image and applies the patches. This allows using just the second Dockerfile but with a different base image, e.g. the official gentoo/portage docker image (rather than downloading a portage snapshot, you download the portage docker image). The base image is specified via build-args.

The way the Dockerfiles are constructed, you can use them directly by specifying the build-args, so you don't need to use bake.
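
For example, assuming the build-arg is called BASE_IMAGE (check the branch for the actual names), a direct build would look something like:

% docker build -f Dockerfile.kubler --build-arg BASE_IMAGE=gentoo/portage -t kubler-gentoo/portage .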

The docker-bake.hcl gives a nice high level interface to all this, like passing the build-args, setting variables, and grouping targets.

There are several options; by default it uses the upstream official gentoo/portage image as the base image for kubler's portage image (it applies patches). Or you can use the decomposed kubler style of copying the portage snapshot from the host.

I also vendored the gentoo/portage Dockerfile, so you can use their style, which downloads the portage snapshot inside the container rather than copying it from the host.

I made it so you can copy portage snapshots from outside directories. This uses a bind mount, so it's much faster: I have 9.5 GB in my ~/.kubler/downloads directory, and without the bind mount the whole tree would be copied as part of the context, which would be a lot slower.

BuildKit caches everything, understands the dependencies between images/stages, and does things in parallel / pipelined.

I removed the volume on /var/db/repos/gentoo, because it causes the containers to copy the large directory, which makes container start-up slow. I want immutable infrastructure, so I don't want to be updating portage inside the image; I want to build a new image instead.

Extending this to bob-core etc. would use bind mounts rather than volumes, so there's no unnecessary copying.

Here is a build without using the cache that uses the decomposed kubler style of copying a snapshot from the host, but copying one from an outside directory, my ~/.kubler/downloads/:

% SNAPSHOT=portage-20230423.tar.xz docker buildx bake kubler --load --set gentoo-portage.contexts.portage=$HOME/.kubler/downloads --no-cache
[+] Building 117.0s (21/21) FINISHED
 => [gentoo-portage internal] load .dockerignore                                                                                                 0.0s
 => => transferring context: 2B                                                                                                                  0.0s
 => [gentoo-portage internal] load build definition from Dockerfile.download                                                                     0.0s
 => => transferring dockerfile: 1.40kB                                                                                                           0.0s
 => [kubler] resolve image config for docker.io/docker/dockerfile:1                                                                              3.4s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                                                 0.0s
 => CACHED [kubler] docker-image://docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14         0.0s
 => => resolve docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409b14                             0.0s
 => [gentoo-portage context portage] load .dockerignore                                                                                          0.0s
 => => transferring portage: 2B                                                                                                                  0.0s
 => [kubler internal] load metadata for docker.io/library/busybox:latest                                                                         2.8s
 => [auth] library/busybox:pull token for registry-1.docker.io                                                                                   0.0s
 => [kubler context portage] load from client                                                                                                    0.1s
 => => transferring portage: 15.65kB                                                                                                             0.0s
 => CACHED [kubler stage-2 1/2] FROM docker.io/library/busybox:latest@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402a0dc16    0.0s
 => => resolve docker.io/library/busybox:latest@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402a0dc16                          0.0s
 => [kubler internal] load build definition from Dockerfile.kubler                                                                               0.0s
 => => transferring dockerfile: 1.62kB                                                                                                           0.0s
 => [kubler internal] load .dockerignore                                                                                                         0.0s
 => => transferring context: 2B                                                                                                                  0.0s
 => [gentoo-portage builder 2/2] RUN --mount=type=bind,target=/portage,from=portage <<-EOF (#!/bin/sh...)                                       26.4s
 => [kubler internal] load build context                                                                                                         0.0s
 => => transferring context: 300B                                                                                                                0.0s
 => [gentoo-portage stage-2 2/2] COPY --from=builder /var/db/repos/gentoo /var/db/repos/gentoo                                                  25.1s
 => [kubler patcher 1/4] COPY patches/ /patches                                                                                                  0.8s
 => [kubler patcher 2/4] WORKDIR /var/db/repos/gentoo                                                                                            0.0s
 => [kubler patcher 3/4] RUN <<-EOF (#!/bin/sh...)                                                                                               0.2s
 => [kubler stage-1 2/2] COPY --from=patcher /var/db/repos/gentoo /var/db/repos/gentoo                                                          22.5s
 => [kubler] exporting to docker image format                                                                                                   29.7s
 => => exporting layers                                                                                                                         14.7s
 => => exporting manifest sha256:c3b611bacc5d17a6a4d77dc9cc90f7ac4edfbf1f0c296b196e16d925c083e2ee                                                0.0s
 => => exporting config sha256:239cb2143c08bacd6de595218fa399aa8ca3b33dd28740e05aa2ee6063e3be4c                                                  0.0s
 => => sending tarball                                                                                                                          14.9s
 => importing to docker

And re-running the build again with the cache:

% SNAPSHOT=portage-20230423.tar.xz docker buildx bake kubler --load --set gentoo-portage.contexts.portage=$HOME/.kubler/downloads
[+] Building 8.2s (21/21) FINISHED
 => [gentoo-portage internal] load build definition from Dockerfile.download                                       0.0s
 => => transferring dockerfile: 1.40kB                                                                             0.0s
 => [gentoo-portage internal] load .dockerignore                                                                   0.0s
 => => transferring context: 2B                                                                                    0.0s
 => [kubler] resolve image config for docker.io/docker/dockerfile:1                                                3.5s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                   0.0s
 => CACHED [kubler] docker-image://docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb272456  0.0s
 => => resolve docker.io/docker/dockerfile:1@sha256:39b85bbfa7536a5feceb7372a0817649ecb2724562a38360f4d6a7782a409  0.0s
 => [gentoo-portage context portage] load .dockerignore                                                            0.0s
 => => transferring portage: 2B                                                                                    0.0s
 => [kubler internal] load metadata for docker.io/library/busybox:latest                                           2.8s
 => [auth] library/busybox:pull token for registry-1.docker.io                                                     0.0s
 => [kubler context portage] load from client                                                                      0.0s
 => => transferring portage: 15.65kB                                                                               0.0s
 => [kubler builder 1/2] FROM docker.io/library/busybox:latest@sha256:b5d6fe0712636ceb7430189de28819e195e8966372e  0.0s
 => => resolve docker.io/library/busybox:latest@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402  0.0s
 => [kubler internal] load build definition from Dockerfile.kubler                                                 0.0s
 => => transferring dockerfile: 1.62kB                                                                             0.0s
 => [kubler internal] load .dockerignore                                                                           0.0s
 => => transferring context: 2B                                                                                    0.0s
 => CACHED [kubler builder 2/2] RUN --mount=type=bind,target=/portage,from=portage <<-EOF (#!/bin/sh...)           0.0s
 => CACHED [kubler stage-2 2/2] COPY --from=builder /var/db/repos/gentoo /var/db/repos/gentoo                      0.0s
 => [kubler internal] load build context                                                                           0.0s
 => => transferring context: 300B                                                                                  0.0s
 => CACHED [kubler patcher 1/4] COPY patches/ /patches                                                             0.0s
 => CACHED [kubler patcher 2/4] WORKDIR /var/db/repos/gentoo                                                       0.0s
 => CACHED [kubler patcher 3/4] RUN <<-EOF (#!/bin/sh...)                                                          0.0s
 => CACHED [kubler stage-1 2/2] COPY --from=patcher /var/db/repos/gentoo /var/db/repos/gentoo                      0.0s
 => [kubler] exporting to docker image format                                                                      0.6s
 => => exporting layers                                                                                            0.0s
 => => exporting manifest sha256:c3b611bacc5d17a6a4d77dc9cc90f7ac4edfbf1f0c296b196e16d925c083e2ee                  0.0s
 => => exporting config sha256:239cb2143c08bacd6de595218fa399aa8ca3b33dd28740e05aa2ee6063e3be4c                    0.0s
 => => sending tarball                                                                                             0.6s
 => importing to docker                                                                                            0.0s

Launching the image is much faster without the portage volume:

% time docker run --rm -it kubler-gentoo/portage grep TIMESTAMP /var/db/repos/gentoo/Manifest
TIMESTAMP 2023-04-26T00:39:40Z
docker run --rm -it kubler-gentoo/portage grep TIMESTAMP   0.01s user 0.04s system 3% cpu 1.288 total

Using bake will make #228 easy, because the bake file can declare which build args exist (and which values they take).

Getting Kubler to work on BuildKit (#215) requires a lot of changes, due to the images living in the builder rather than locally. Kubler using bake will make this transition easier.

It will also pick up the caching, parallelism, and performance, and make integrating with CI easier. Currently, naive CI will spend 17 minutes building bob-musl and bob-glibc every time, but if kubler used BuildKit, then CI (at least GitHub Workflows) has easy-to-implement caching of Docker (BuildKit) caches. Otherwise, to speed up CI, the workflow needs a lot more development effort spent rolling its own caching: exporting the images, adding them to a cache, and importing them. It's not that hard to do, and I already cache the downloads and distfiles etc., but since BuildKit support is needed anyway I'd rather work on that.
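
For example, buildx's gha cache backend should get most of that for free when running inside a GitHub workflow (untested with this bake file; it also needs the Actions runtime token exported to the environment):

% docker buildx bake kubler \
    --set '*.cache-from=type=gha' \
    --set '*.cache-to=type=gha,mode=max'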

My plan is to keep working on this to add the builders (bob, bob-musl) and bob-core, and to get kubler using BuildKit, so CI will be performant and can use new features like multi-arch images, SBOM attestations, etc.

r7l commented 1 year ago

@berney Are you in the Kubler Discord?

If you need someone to test things, I am happy to do that. But I won't be able to start before next week.

berney commented 1 year ago

I am, I'll jump on later (another day).

Kangie commented 1 year ago

I've been discussing bringing this package into ::gentoo under the Containers project on IRC; it's proven to be incredibly useful to me over the last little bit! Excited to see progress towards 1.0.0.

r7l commented 1 year ago

There are a couple of issues that might prevent Kubler from being added to the main Portage tree just like that. The first one is that a number of ebuilds are not really suited for Kubler out of the box.

Another big issue I see is the state of the user eclasses. The amount of patching needed for those eclasses is constantly growing, as they've been updated to work with the $ROOT variable in a way Kubler does not really support. Every time someone edits those eclasses, Kubler is broken until the patches have been updated.

Another one is missing dependencies: most ebuilds assume a real Gentoo system where certain packages are simply present, so maintainers just leave them out. Those ebuilds usually fail with Kubler and need manual handling within the Kubler image.

Since there are already official Gentoo Docker images out there, I don't really see them moving to Kubler anytime soon, even if I would like to see it.

For a start, it would be great if the Kubler overlay were added to the list of known Gentoo overlays, so Kubler could be found via search tools like Zugaina.

thesamesam commented 1 year ago

There are a couple of issues that might prevent Kubler from being added to the main Portage tree just like that. The first one is that a number of ebuilds are not really suited for Kubler out of the box.

Another big issue I see is the state of the user eclasses. The amount of patching needed for those eclasses is constantly growing, as they've been updated to work with the $ROOT variable in a way Kubler does not really support. Every time someone edits those eclasses, Kubler is broken until the patches have been updated.

That's really something we just need to hash out together rather than something insurmountable.

I understand the status quo isn't ideal, but it's somewhat out of our control given you can't use NSS modules from within a ROOT safely (which is why you need user accounts installed in silly places for e.g. chown to work). But maybe we can figure something out. Talk to us!

Another one is missing dependencies: most ebuilds assume a real Gentoo system where certain packages are simply present, so maintainers just leave them out. Those ebuilds usually fail with Kubler and need manual handling within the Kubler image.

We're not against adding such dependencies, they're just easy to miss. They're important for a bunch of cases (cross-compilation where a ROOT is often minimal, or bootstrapping a new ROOT, or prefix, or ...).
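
For illustration, it mostly comes down to spelling out in the ebuild what a full base system would otherwise paper over (package names purely illustrative):

# in the ebuild: a minimal ROOT has no implicit base system, so be explicit
RDEPEND="
    acct-user/nginx
    sys-libs/zlib
"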

Since there are already official Gentoo Docker images out there, I don't really see them moving to Kubler anytime soon, even if I would like to see it.

For a start, it would be great if the Kubler overlay were added to the list of known Gentoo overlays, so Kubler could be found via search tools like Zugaina.

Someone just needs to submit a PR to https://github.com/gentoo/api-gentoo-org to add it to repositories.xml.

r7l commented 1 year ago

@thesamesam My comment wasn't meant as criticism of the state of Kubler or of Gentoo in relation to Kubler; I was just listing some of the issues I could think of. Having Kubler in the main Portage tree would be awesome.

I understand the status quo isn't ideal, but it's somewhat out of our control given you can't use NSS modules from within a ROOT safely (which is why you need user accounts installed in silly places for e.g. chown to work). But maybe we can figure something out. Talk to us!

There is #209, where you've already replied at some point. I am not involved enough in this issue. So far @edannenberg has jumped in and worked out the issues whenever it broke. But I think he might be interested in a working solution for the future, as this keeps popping up.

We're not against adding such dependencies, they're just easy to miss. They're important for a bunch of cases (cross-compilation where a ROOT is often minimal, or bootstrapping a new ROOT, or prefix, or ...).

I'll try to look out for those and report them back in the future.

Someone just needs to submit a PR to https://github.com/gentoo/api-gentoo-org to add it to repositories.xml.

I know how it works, and if I am not mistaken, it's also a requirement for the maintainer to have a working account on https://bugs.gentoo.org. So I would leave that to @edannenberg, as it is his repository in the first place.

Kangie commented 1 year ago

Another one is missing dependencies: most ebuilds assume a real Gentoo system where certain packages are simply present, so maintainers just leave them out. Those ebuilds usually fail with Kubler and need manual handling within the Kubler image.

That's actually why I'm so excited to bring more disposable, lightweight containers into ebuild development. The tinderboxes can only catch so much, as they come with a full base system. I'll see what I can do!

edannenberg commented 1 year ago

I've been discussing bringing this package into ::gentoo under the Containers project on IRC; it's proven to be incredibly useful to me over the last little bit! Excited to see progress towards 1.0.0.

Glad to hear it's been useful to you. There have still been too many other things on my plate this year to make progress on this, but we will get there eventually. :+1: