buildkite-plugins / docker-compose-buildkite-plugin

🐳⚡️ Run build scripts, and build + push images, w/ Docker Compose

Bad tag separator strikes again #410

Closed toothbrush closed 6 months ago

toothbrush commented 10 months ago

I just got bitten by what looks like the same issue as https://github.com/buildkite-plugins/docker-compose-buildkite-plugin/issues/352 which led me to https://github.com/buildkite-plugins/docker-compose-buildkite-plugin/pull/353.

This is my pipeline:

steps:
  - name: ":docker: Build"
    plugins:
      - docker-compose#v4.14.0:
          # cli-version: 2 # Broken unless i set cli-version: 2
          push:
            - app:myorg/${BUILDKITE_PIPELINE_SLUG}:${BUILDKITE_BUILD_NUMBER}

Here's what happens:

~~~ Running plugin docker-compose command hook
$ /var/lib/buildkite-agent/plugins/github-com-buildkite-plugins-docker-compose-buildkite-plugin-v4-14-0/hooks/command
$ docker-compose -f docker-compose.yml -p buildkite018b4b17238a4a8a95d3497da7cdfedf config
~~~ :docker: Building app
$ docker-compose -f docker-compose.yml -p buildkite018b4b17238a4a8a95d3497da7cdfedf build app
#1 [app internal] load build definition from Dockerfile
#1 transferring dockerfile: 109B done
#1 DONE 0.0s

[... bunch of building...]

#6 [app] exporting to image
#6 exporting layers done
#6 writing image sha256:9fab1a6b4f3fb7ca52968d91c762fc55fa84aa9191a7ebf10547772f06d5de58 done
#6 naming to docker.io/library/buildkite018b4b17238a4a8a95d3497da7cdfedf-app done
#6 DONE 0.0s

~~~ :docker: Pushing image myorg/app-template
$ docker tag buildkite018b4b17238a4a8a95d3497da7cdfedf_app myorg/app-template
Error response from daemon: No such image: buildkite018b4b17238a4a8a95d3497da7cdfedf_app:latest

The eagle-eyed reader will see a problem: ...97da7cdfedf-app versus ...97da7cdfedf_app as the image name.

I haven't found where this default is coming from (I'll investigate and update the issue), but it seems our Buildkite agents are defaulting to CLI version 2 whereas the plugin is assuming we're on v1. Specifying cli-version: 2 fixed the issue. I'm wondering if I should try and cook up a PR that either robustly works out which CLI version we're using, or works out the name of the image we just built.

toothbrush commented 10 months ago

Looking at one of our agents, here is what i get:

[ssm-user@ip-172-18-13-243 bin]$ docker-compose --version
Docker Compose version v2.20.2

So it looks like docker-compose (which is the command the plugin uses here) turns into v2 on our system.
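As a rough illustration of "working out which CLI version we're using", the version line shown above could be parsed for its major number. This is only a sketch: `compose_major_version` is a hypothetical helper, not something the plugin provides, and it assumes the output contains the word "version" followed by a dotted version number, as in the line above.

```shell
# Hypothetical helper (not part of the plugin): print the major version
# found in the text that `docker-compose --version` produced.
# Assumes the text contains "version " followed by a dotted number,
# optionally prefixed with "v" (e.g. "Docker Compose version v2.20.2").
compose_major_version() {
  cmv_rest="${1#*version }"   # drop everything up to and including "version "
  cmv_rest="${cmv_rest#v}"    # drop an optional leading "v"
  cmv_major="${cmv_rest%%.*}" # keep only what precedes the first dot

  case "$cmv_major" in
    '' | *[!0-9]*) return 1 ;;            # not a plain number: give up
    *) printf '%s\n' "$cmv_major" ;;
  esac
}

# Example use on an agent (commented out because it needs Docker installed):
# compose_major_version "$(docker-compose --version)"
```

Note that this only tells you which binary answers to docker-compose; it says nothing about whether a wrapper adds --compatibility, which is the other half of the naming question.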

I did a bit more digging, and it turns out that docker-compose on our Buildkite boxes is actually a script which essentially forwards all arguments to docker compose --compatibility "$@", so i guess the version reporting v2 isn't a surprise. Perhaps something weird is happening such that --compatibility doesn't change the - or _ tag separator discrepancy 🤔

toote commented 7 months ago

That is really weird as it looks like your installation does not respect the compatibility flag whatsoever because the image is getting built with a - in its name:

#6 naming to docker.io/library/buildkite018b4b17238a4a8a95d3497da7cdfedf-app done

In theory, --compatibility should change that to buildkite018b4b17238a4a8a95d3497da7cdfedf_app (with an underscore), and that is what the plugin assumes, because it replicates compose's naming logic itself.

In #418 I tried simplifying the code through the use of config --images, but that flag is only available from 2.20+ (so not all CLI v2 installations would have it), and it prints images for the service and its dependencies rather than just the single specified service. That means this plugin still needs to replicate the compose behaviour of using - in v2 and _ in v1 or v2 with compatibility. :shrug:
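The separator rule described here can be written down as a small function. This is a minimal sketch of the rule as stated in this thread, not the plugin's actual code; `default_image_name` and its arguments are hypothetical names chosen for illustration.

```shell
# Hypothetical sketch of the naming rule described above:
#   CLI v2 without compatibility     -> "<project>-<service>"
#   CLI v1, or v2 with compatibility -> "<project>_<service>"
# Arguments: project, service, cli version (1 or 2),
#            compatibility ("true" or "false", defaults to "false").
default_image_name() {
  din_project="$1"
  din_service="$2"
  din_cli_version="$3"
  din_compat="${4:-false}"

  din_separator="_"
  if [ "$din_cli_version" = "2" ] && [ "$din_compat" != "true" ]; then
    din_separator="-"
  fi

  printf '%s%s%s\n' "$din_project" "$din_separator" "$din_service"
}

default_image_name buildkite018b4b17238a4a8a95d3497da7cdfedf app 2
default_image_name buildkite018b4b17238a4a8a95d3497da7cdfedf app 1
```

With the project name from the log in this issue, the first call yields the hyphenated name the build actually produced, and the second yields the underscored name the plugin tried to tag.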

toote commented 6 months ago

While working on #420 we were able to reproduce this issue exactly. The plugin defaults to CLI v1 (docker-compose), but the Docker installation on the agent running the job actually had CLI v2 (docker compose) installed, with a v1 wrapper that forwarded calls to the v2 CLI without the compatibility flag.

That means that the plugin assumed V1 but internally it was running V2 so the code to calculate the automatic image names did the wrong thing due to the messy setup.

Luckily, the upcoming release that will default to V2 should completely eliminate this issue, as I cannot think of an installation that would create a docker plugin called compose to wrap V2 calls and actually call CLI V1.