Firehed / multistage-docker-build-action

Github Action for optimized multi-stage docker builds
10 stars 3 forks source link

Change to use docker buildx builder #36

Closed gonuke closed 1 year ago

gonuke commented 1 year ago

I have a Dockerfile that builds successfully when using the docker/build-push-action action and fails when using this multistage-docker-build-action. The way that it fails is very subtle. In the former, a cmake command successfully finds a dependency while in the latter it does not.

The only apparent difference between the two is that the docker/build-push-action is using the buildx builder and this multistage action is not.

It may be as simple as changing the docker command from "build" to "buildx build" in function buildStage. I am not sure how to test this independently.

gonuke commented 1 year ago

This may be related to the BuildKit known issue mentioned in the ReadMe?

gonuke commented 1 year ago

With more testing, I've confirmed that my Dockerfile fails locally when I turn off BuildKit, so maybe it's also partly my problem, although using BuildKit here would be convenient too.

Firehed commented 1 year ago

Thanks for the feedback here @gonuke - I hadn't explored buildx and perhaps it will lead to improvement with the buildkit caching issues. It sounds like a straightforward change to try!

For what it's worth, I've also run into situations where my build fails with buildkit disable; typically something silly like needing a trailing slash on a directory in a COPY step. In the short-term, that (or something similar) may be a sufficient workaround.

gonuke commented 1 year ago

I think I found the difference: the buildx propagates build-args across stages: In case you care, here's my minimum working example:

ARG foo="FOO"
ARG bar=3

FROM ubuntu:20.04 AS stage1
RUN echo stage1 foo = ${foo}

FROM stage1 AS stage2
ARG foo
RUN echo stage2 foo = ${foo}

FROM stage2 AS stage3
RUN echo stage3 foo = ${foo}

Using buildx the value of foo is defined in stage2 & stage3 but NOT stage1. Without buildx the value of foo is only defined in stage2 and NOT in stage1 or stage3

gonuke commented 1 year ago

I mostly fixed my immediate problem, but did find another reason to use BuildKit/buildx. Imagine that you have stages like the following:

FROM base AS stage1
...

FROM stage1 AS stage2A
...

FROM stage2A AS stage2B
...

FROM stage1 AS stage3A
...

FROM stage3A AS stage3B
.....

If you try to build only stage3B it will force building of all the stages. Notably, stage2A and stage2B are not required for stage3B but will be built anyway. (https://docs.docker.com/build/building/multi-stage/#differences-between-legacy-builder-and-buildkit)

Firehed commented 1 year ago

Agreed! I've done some periodic testing to see if the layer caching will behave with buildkit on, and as of the last attempt, it was still inconsistent at best; as a workaround, I've added the intermediate stages into the cache list which at least speeds things up a bit, but it's far from optimal.

At the moment I don't have a ton of spare bandwidth to test the buildx approach, but if you want to have a go at it and send a PR I'd be more than happy to review :)

gonuke commented 1 year ago

I definitely know the feeling of wanting a PR contributed 😁, but I'm not sure I have the bandwidth to learn enough about all of this to get it right or test it reliably.

Firehed commented 1 year ago

Hey @gonuke - I had a little time today to make an attempt at this: #39. It took a ton of trial and error (since local behavior is, for whatever reason, a little different than GHA), but I think the head of that PR has a functionally-correct implementation.

I could definitely use a hand testing. I think the following procedure should work:

If this works, I'd love your feedback on the interface to this - I don't particularly like the approach of setting and detecting an environment variable. I'm considering a new optional boolean input (e.g. parallel: true) which will take care of things.

I've had quite a crazy week due to some dayjob stuff so I probably won't quite be able to wrap this up yet, but I'm thankful for your encouragement on this as it's been bothering me for a while too!

gonuke commented 1 year ago

Thanks @Firehed. Pinging @bquan0 to take a look at this

bquan0 commented 1 year ago

Hi @Firehed, I ran some tests this week following the steps in this conversation, and it seems to be working with the functionalities of buildx (args propagating to later stages, not building stages that the final stage isn't dependent on). You can see more details in the PR above that I made in another repo.

Firehed commented 1 year ago

Thanks @bquan0 and @gonuke! I'm about wrapping things up for the week but I'll hopefully be able to do one more quick round of testing and cleanup early next week, and then we can hopefully landed and tagged!

gonuke commented 1 year ago

@Firehed - do you know of any failure modes that we should probe? Our testing was very simple, so may have overlooked some edge cases.

Firehed commented 1 year ago

@gonuke Not to my knowledge; I wanted to get a few more builds running through the action to ensure that the layer cache is reliable, as all of the various --cache-from flags become a bit more complex.

Note that on the branch, I made a small enhancement: you can now set parallel: true in the options (with: section) rather than having to mess with environment variables. I've been testing this on a project that should benefit significantly from parallel builds and it seems to work as expected.

I'm planning to land and tag the change (as v1.7.0) soon, but wanted to give you a heads up in case you want to test or provide additional feedback on the format or docs.

So now the change looks roughly like this:

       - name: Build images
         id: build
-        uses: firehed/multistage-docker-build-action@v1
+        uses: firehed/multistage-docker-build-action@c1bb0bf8b8dd37c020c4bc6d496851300f33e1d3
         with:
           repository: ghcr.io/your/repo
+          parallel: true
-          stages: node-dependencies, mix, php-dependencies, server-setup, server
+          stages: node-dependencies, frontend-build
           server-stage: webserver

(the final tagged version will not require the commit override - v1 will keep working once released)

Firehed commented 1 year ago

Hi folks - just wanted to let you know that this is landed and now in a tagged release. v1.7 or later will support buildx by using the parallel: true option! Thanks again for your help testing and encouragement with this :)