edannenberg / kubler

A generic, extendable build orchestrator.
BSD 2-Clause "Simplified" License

Kubler Build breaks with docker buildx #215

Open berney opened 1 year ago

berney commented 1 year ago

If docker buildx is being used, either because of docker buildx install or because the environment variable DOCKER_BUILDKIT=1 is set, kubler build will break.

I am interested in using buildx for several reasons:

  1. Multi platform/architecture images
  2. Caching (especially in CI/CD)
  3. It's the future

I've been working on running kubler in CI/CD. I am caching the ~/.kubler/downloads, distfiles, and packages directories, which makes subsequent runs much faster. But it is still fairly slow compared to running locally, because every run rebuilds all the docker images, including portage and the builders (kubler/bob, kubler/bob-musl, etc.), which involves copying/loading large files (the portage snapshot and stage3s).

I want to cache the docker images between runs, and I want to use buildx, which can cache layers in a docker registry; that is particularly nice for CI/CD.
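
For example, something along these lines; the builder name and registry/image names here are just placeholders, not anything kubler does today:

docker buildx create --name ci-builder --driver docker-container --use
docker buildx build \
  --push \
  --tag ghcr.io/example/kubler-gentoo/portage:latest \
  --cache-to type=registry,ref=ghcr.io/example/kubler-gentoo/portage:buildcache,mode=max \
  --cache-from type=registry,ref=ghcr.io/example/kubler-gentoo/portage:buildcache \
  .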

Trying to use buildx causes problems in kubler: because the buildx builder uses the docker-container driver, the built images live inside the builder container. This causes the docker tag ...:latest operations to fail, as the image doesn't exist on the local host.

berney (main)[1] % kubler build -v kubler/busybox
»»»»»[init]» generate build graph
»»» required engines:    docker
»»» required stage3:     stage3-amd64-musl-hardened
»»» required builders:   kubler/bob-musl
»»» build sequence:      kubler/busybox
»[✔]»[init]» done.
»»»»»[portage]» download portage snapshot
--2022-09-12 01:28:46--  http://distfiles.gentoo.org//snapshots/portage-latest.tar.xz
Resolving distfiles.gentoo.org (distfiles.gentoo.org)... 143.244.62.5, 2a02:6ea0:db00::1
Connecting to distfiles.gentoo.org (distfiles.gentoo.org)|143.244.62.5|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43260048 (41M) [application/x-xz]
Saving to: '/home/berne/.kubler/downloads/portage-20220912.tar.xz'

/home/berne/.kubler/downloads/portage-20220912.tar.xz       100%[========================================================================================================================================>]  41.26M   418KB/s    in 1m 41s

2022-09-12 01:30:29 (417 KB/s) - '/home/berne/.kubler/downloads/portage-20220912.tar.xz' saved [43260048/43260048]

»»»»»[portage]» bootstrap kubler-gentoo/portage image
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 32.4s (9/9) FINISHED
 => [internal] load .dockerignore                                                                                                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                                                                                                         0.0s
 => [internal] load build definition from Dockerfile                                                                                                                                                                                    0.0s
 => => transferring dockerfile: 663B                                                                                                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                                                                                                                                                                       5.1s
 => [auth] library/busybox:pull token for registry-1.docker.io                                                                                                                                                                          0.0s
 => [1/4] FROM docker.io/library/busybox:latest@sha256:20142e89dab967c01765b0aea3be4cec3a5957cc330f061e5503ef6168ae6613                                                                                                                 0.5s
 => => resolve docker.io/library/busybox:latest@sha256:20142e89dab967c01765b0aea3be4cec3a5957cc330f061e5503ef6168ae6613                                                                                                                 0.0s
 => => sha256:2c39bef88607fd321a97560db2e2c6d029a30189c98fafb75240db93c26633ad 773.28kB / 773.28kB                                                                                                                                      0.4s
 => => extracting sha256:2c39bef88607fd321a97560db2e2c6d029a30189c98fafb75240db93c26633ad                                                                                                                                               0.1s
 => [internal] load build context                                                                                                                                                                                                       0.8s
 => => transferring context: 43.29MB                                                                                                                                                                                                    0.8s
 => [2/4] COPY portage-20220912.tar.xz /                                                                                                                                                                                                0.2s
 => [3/4] COPY patches/ /patches                                                                                                                                                                                                        0.0s
 => [4/4] RUN set -x &&     mkdir -p /var/db/repos/ &&     xzcat /portage-20220912.tar.xz | tar -xf - -C /var/db/repos &&     mv /var/db/repos/portage /var/db/repos/gentoo &&     mkdir -p /var/db/repos/gentoo/metadata &&     rm /  26.1s
»»»»»[portage]» tag image kubler-gentoo/portage:latest
Error response from daemon: No such image: kubler-gentoo/portage:20220731T170548Z
»[✘]»[portage]» fatal: Failed to tag kubler-gentoo/portage:20220731T170548Z

The images need to be either loaded into the local Docker image store by giving an extra --load argument, or pushed to a remote registry with the --push argument.
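
For a single-platform build something like this should be enough to make the tag step work again (the date tag here is just a placeholder):

# --load exports the result from the buildx builder into the local Docker image store
docker buildx build --load -t kubler-gentoo/portage:20220912 .
# ...so the tag step that kubler runs afterwards can actually find the image
docker tag kubler-gentoo/portage:20220912 kubler-gentoo/portage:latest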

With that fixed, the next problem is loading the stage3: the image is available on the local host but not inside the buildx builder container.

»»»»»[portage]» tag image kubler-gentoo/portage:latest
»»»»»[portage]» create the portage container
»»»»»[kubler/bob-musl-core]» download stage3-amd64-musl-hardened-20220828T170542Z.tar.xz
»»»»»[kubler/bob-musl-core]» import stage3-amd64-musl-hardened-20220828T170542Z.tar.xz
sha256:cd04c50e814986e3c758ea1f875840c7a93878494f44594d9f2d6506b1c33f3d
»»»»»[kubler/bob-musl-core]» tag kubler-gentoo/stage3-amd64-musl-hardened:latest
»»»»»[kubler/bob-musl-core]» exec docker build -t kubler/bob-musl-core:20220731T170548Z
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 4.3s (4/4) FINISHED
 => [internal] load build definition from Dockerfile                                                                                                                                                                                    0.0s
 => => transferring dockerfile: 1.25kB                                                                                                                                                                                                  0.0s
 => [internal] load .dockerignore                                                                                                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                                                                                                         0.0s
 => ERROR [internal] load metadata for docker.io/kubler-gentoo/stage3-amd64-musl-hardened:latest                                                                                                                                        4.2s
 => [auth] kubler-gentoo/stage3-amd64-musl-hardened:pull token for registry-1.docker.io                                                                                                                                                 0.0s
------
 > [internal] load metadata for docker.io/kubler-gentoo/stage3-amd64-musl-hardened:latest:
------
Dockerfile:1
--------------------
   1 | >>> FROM kubler-gentoo/stage3-amd64-musl-hardened
   2 |     LABEL maintainer="Erik Dannenberg <erik.dannenberg@xtrade-gmbh.de>"
   3 |
--------------------
ERROR: failed to solve: kubler-gentoo/stage3-amd64-musl-hardened: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
»[✘]»[kubler/bob-musl-core]» fatal: exec docker build -t kubler/bob-musl-core:20220731T170548Z

The local image needs to be made available inside the builder docker-container, but there isn't a straightforward way to do this.
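
One possible workaround I can think of (just a sketch, not something kubler does today) is to run a throwaway local registry that the builder container can reach, push the imported stage3 there, and have the builder Dockerfile pull it from there:

# local registry on the host
docker run -d --name registry -p 5000:5000 registry:2
# builder container sharing the host network, so localhost:5000 is reachable from inside it
docker buildx create --name kubler-builder --driver docker-container \
  --driver-opt network=host --use
# publish the imported stage3 where the builder can pull it
docker tag kubler-gentoo/stage3-amd64-musl-hardened:latest \
  localhost:5000/kubler-gentoo/stage3-amd64-musl-hardened:latest
docker push localhost:5000/kubler-gentoo/stage3-amd64-musl-hardened:latest
# the FROM line would then have to reference localhost:5000/..., and the builder may
# need a buildkitd.toml that marks localhost:5000 as a plain-HTTP registry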

NOTE: For multi-platform images, --load won't work, as the local image store doesn't yet understand multi-platform images. For multi-platform builds, docker buildx build --push works and uploads the result to a remote registry.
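
E.g. (the image name is a placeholder):

docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --push \
  -t ghcr.io/example/busybox:latest \
  .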

If --push isn't used, a multi-platform image can still be created by combining several single-platform images into one manifest, e.g. with docker buildx imagetools create.
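
Roughly (tags are placeholders):

docker buildx imagetools create \
  -t ghcr.io/example/busybox:latest \
  ghcr.io/example/busybox:amd64 \
  ghcr.io/example/busybox:arm64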

Supporting buildx will help with #200.

r7l commented 1 year ago

Since you've pointed back to my issue, i might add that i am still not convinced about using buildx. I haven't had a deeper look into it, but i would assume that most of the time people will just use already prebuilt images as parents. So basically just pull an ARM image of Alpine and go on from there. Is anyone actually building the OS from scratch with buildx? Even so, will those people use the exact same set of defaults for multiple architectures? Because that's pretty much what Kubler would currently do.

There are a couple of issues with building images for multiple platforms using Kubler and Gentoo. Maybe i haven't put enough effort into it, but for me the best option was to just have multiple builders. So i have a builder for amd64 and another one for arm64. Setting build.conf (BOB_CHOST ...) differently is just the most obvious thing. Each architecture has a different stage3 download URL on the Gentoo servers, and so on. There will probably be quite some code needed to assign all of this automatically out of the box when using buildx.

berney commented 1 year ago

Thanks for your thoughts. I think supporting both traditional Docker build and buildx (BuildKit) would be good.

IMO sooner or later Docker will switch to BuildKit, deprecate the old builder, and later remove it. So eventually this work will need to be done.

Currently, if people have switched to buildx as the default via docker buildx install, kubler breaks. So at a minimum this should be documented, with instructions on how to work around it (use the old builder).
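
For example, assuming buildx was made the default via docker buildx install:

# undo the "docker build" alias that docker buildx install created
docker buildx uninstall
# and/or disable BuildKit for a single invocation so the legacy builder is used
DOCKER_BUILDKIT=0 kubler build kubler/busybox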

I use Kubler to build single-purpose images, often with a single binary and hence based off scratch as the parent. I would like multi-platform versions of these; a multi-platform image is just several images bundled together by a manifest. I could do that outside of kubler in CI/CD, or I could just use different names or tags, which is what you must be doing.

Yesterday I was working on using Kubler in CI/CD, and caching the Docker images and layers would greatly increase the speed of the pipeline runs. This would be easy to achieve with buildx, if it worked in kubler.

Yesterday I realised that naming the builders etc. kubler/* is not conducive to everyone building their own (Gentoo style) and pushing them, because of owner/name conflicts; it is conducive to central builds that everyone pulls (Docker style). Personally, I like building them myself (control). Thinking about it more today, if the names were prefixed like ghcr.io/berney/kubler/bob-musl I could push without conflicting with others, and then multiple machines could leverage the existing images/layers, e.g. in CI/CD.
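
Retagging and pushing by hand would look roughly like:

docker tag kubler/bob-musl:latest ghcr.io/berney/kubler/bob-musl:latest
docker push ghcr.io/berney/kubler/bob-musl:latest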

r7l commented 1 year ago

If Kubler supports buildx, i might give it a try. But even so, i still wonder how you'll manage to combine Kubler + Gentoo with the Docker approach to multi-platform support. I am sure it will be possible to tweak Kubler into working with the buildx way of dealing with multiple platforms.

How do you plan to build the builders (bob) for each platform? I am currently using 2 builders, called builder-amd64 and builder-arm64, each with their own bob directories configured for their architecture. It might be possible to get around this by not using build.conf much and instead setting the values in kubler.conf differently for each platform. I have not done this, since the two builders are slightly different anyway. I am not building as many images for arm64 as i am for amd64, mostly because i am only building what i am actually going to use.
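
Roughly, the split looks like this; the values are only illustrative, not my exact configs:

# builder/builder-amd64/build.conf
STAGE3_BASE="stage3-amd64-musl-hardened"   # illustrative
BOB_CHOST="x86_64-pc-linux-musl"           # illustrative

# builder/builder-arm64/build.conf
STAGE3_BASE="stage3-arm64-musl-hardened"   # illustrative
BOB_CHOST="aarch64-unknown-linux-musl"     # illustrative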

I am using the Kubler images with CI/CD as well, but not the Kubler build process itself. Instead i am building them manually and caching the final images in my registry using date tags. After using Kubler for quite some time, i don't trust the build process to be solid enough to not be a constant pain with CI/CD on top. That's not Kubler's fault at all; rather, Gentoo is always on the move, and building into Docker is not really supported by Gentoo. So every other rebuild i run into some issues here and there, and then i need to step in to fix the image in question.

berney commented 1 year ago

I don't know yet, but I plan to work on it. I am hoping to minimise the differences between the images (the kubler definitions) and the builders, just run buildx with multiple platforms (amd64 and arm64), and get both working. But potentially I'll need dedicated configs or some patch/diff... I'll try to come up with a good solution.

I hear you regarding builds breaking often due to Gentoo and its packages. That's part of why I want it in CI/CD though: the images can be rebuilt periodically, and if something breaks the build will fail, I'll be alerted, and I can fix it at my leisure rather than find out it's broken when I want to use it.

First, I'm going to try to get single-platform builds working with buildx, and then move on to multi-platform.

r7l commented 1 year ago

Looks like we are going to be forced to move to BuildKit sooner rather than later:

DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            Install the buildx component to build images with BuildKit:
            https://docs.docker.com/go/buildx/

I am getting this warning now when using Kubler.