buildkite / elastic-ci-stack-for-aws

An auto-scaling cluster of build agents running in your own AWS VPC
https://buildkite.com/docs/quickstart/elastic-ci-stack-aws
MIT License

Outdated buildkit version on newer cloudformation stacks #1398

Open jaimebarriga opened 1 week ago

jaimebarriga commented 1 week ago

Describe the bug

We recently upgraded our CloudFormation stack from v6.21 to v6.30 to pick up this update, which upgrades buildx to 0.15 so that we can get BuildKit v0.14. We did this because we were affected by the BuildKit cache issue that this PR was attempting to fix.

After upgrading, I am still seeing the BuildKit caching issue. I did some digging and found that the BuildKit version is still an older one (v0.12.5), even though I confirmed that we are now using buildx v0.15.

From an older instance (Stack version: v6.21)

> docker info
Client:
 Version:    25.0.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

> docker buildx ls
NAME/NODE     DRIVER/ENDPOINT   STATUS    BUILDKIT   PLATFORMS
default*      docker
 \_ default    \_ default       running   v0.12.5    linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

> docker buildx inspect
Name:   default
Driver: docker

Nodes:
Name:             default
Endpoint:         default
Status:           running
BuildKit version: v0.12.5
Platforms:        linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

From a newer instance (Stack version: v6.30)

> docker info
Client:
 Version:    25.0.5
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.15.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

> docker buildx ls
NAME/NODE     DRIVER/ENDPOINT   STATUS    BUILDKIT   PLATFORMS
default*      docker
 \_ default    \_ default       running   v0.12.5    linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

> docker buildx inspect
Name:   default
Driver: docker

Nodes:
Name:             default
Endpoint:         default
Status:           running
BuildKit version: v0.12.5
Platforms:        linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
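
The comparison above can be scripted so that a bootstrap or pre-build step fails loudly when the active builder is older than required. This is only a sketch: the helper names `buildkit_version` and `version_at_least` are made up for illustration, and the parsing assumes the `BuildKit version:` line format shown in the `docker buildx inspect` output above.

```shell
# Hypothetical helpers (names are illustrative, not part of the stack).

# Read `docker buildx inspect` output on stdin and print the first
# "BuildKit version:" value, e.g. v0.12.5.
buildkit_version() {
  awk -F': *' '/^BuildKit version:/ { print $2; exit }'
}

# Succeed if version $1 is greater than or equal to version $2.
# A leading "v" is stripped; fields are compared numerically, left to right.
version_at_least() {
  awk -v have="${1#v}" -v want="${2#v}" 'BEGIN {
    n = split(have, h, "."); m = split(want, w, ".")
    for (i = 1; i <= (n > m ? n : m); i++) {
      if ((h[i] + 0) > (w[i] + 0)) exit 0
      if ((h[i] + 0) < (w[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Example use on an instance (requires docker, so left commented out):
#   version="$(docker buildx inspect | buildkit_version)"
#   version_at_least "$version" v0.14.0 ||
#     echo "BuildKit $version is older than v0.14.0" >&2
```

With the output from either instance above, the check would report v0.12.5 as older than v0.14.0.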

Attempts to Fix

I tried getting around this by creating my own builder and using it as the default:

docker buildx create --name custombuilder --driver docker-container --driver-opt image=moby/buildkit:v0.14.0 --use
docker buildx inspect custombuilder --bootstrap

Running this manually works:

> docker buildx ls
NAME/NODE            DRIVER/ENDPOINT                   STATUS    BUILDKIT   PLATFORMS
custombuilder*       docker-container
 \_ custombuilder0    \_ unix:///var/run/docker.sock   running   v0.14.0    linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
default              docker
 \_ default           \_ default                       running   v0.12.5    linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

> docker buildx inspect
Name:          custombuilder
Driver:        docker-container
Last Activity: 2024-11-13 19:07:17 +0000 UTC

Nodes:
Name:                  custombuilder0
Endpoint:              unix:///var/run/docker.sock
Driver Options:        image="moby/buildkit:v0.14.0"
Status:                running
BuildKit daemon flags: --allow-insecure-entitlement=network.host
BuildKit version:      v0.14.0
Platforms:             linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

However, when I add the exact same commands to my bootstrap script, they run (I checked the logs), but when I SSH into the instance and list the builders, there is no trace of custombuilder:

> docker buildx ls
NAME/NODE     DRIVER/ENDPOINT   STATUS    BUILDKIT   PLATFORMS
default*      docker
 \_ default    \_ default       running   v0.12.5    linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

> docker buildx inspect
Name:   default
Driver: docker

Nodes:
Name:             default
Endpoint:         default
Status:           running
BuildKit version: v0.12.5
Platforms:        linux/arm64, linux/amd64, linux/amd64/v2, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
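
One thing worth ruling out here (a possibility, not something confirmed in this thread): buildx keeps builder definitions in a per-user directory, `$BUILDX_CONFIG` if set, otherwise `buildx` under the Docker config directory (`$DOCKER_CONFIG`, defaulting to `~/.docker`). A builder created by a bootstrap script running as one user would therefore not show up in `docker buildx ls` for a different user. The sketch below resolves that directory for the current user; the function name `buildx_config_dir` is hypothetical.

```shell
# Hypothetical helper: print the directory where buildx stores builder
# definitions for the current user.
buildx_config_dir() {
  if [ -n "${BUILDX_CONFIG:-}" ]; then
    # An explicit override takes precedence.
    printf '%s\n' "$BUILDX_CONFIG"
  else
    # Otherwise buildx state lives under the Docker config directory.
    printf '%s\n' "${DOCKER_CONFIG:-$HOME/.docker}/buildx"
  fi
}

# Example use on an instance (left commented out; run once in the bootstrap
# script and once in the SSH session, then compare):
#   echo "user=$(id -un) buildx_dir=$(buildx_config_dir)"
#   ls "$(buildx_config_dir)/instances" 2>/dev/null
```

If the two directories differ, the builder may exist but belong to another user. In that case its container should still be visible in `docker ps`, since the docker-container driver names it after the node (here that would be `buildx_buildkit_custombuilder0`).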

Any ideas how I can fix this?

Expected behavior

I expected the BuildKit version to be v0.14.

Actual behaviour

It is instead v0.12.5.

DrJosh9000 commented 3 days ago

Thanks @jaimebarriga, that's confusing. I'll try to replicate and get back to you.