This may be related to the BuildKit known issue mentioned in the README?
With more testing, I've confirmed that my Dockerfile fails locally when I turn off BuildKit, so maybe it's also partly my problem, although using BuildKit here would be convenient too.
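(For anyone reproducing this locally: the builder can be toggled per-command with an environment variable. A quick sketch - the image tag is just a placeholder:)

```sh
# Build with the legacy builder (BuildKit off)
DOCKER_BUILDKIT=0 docker build -t myimage .

# Same build with BuildKit enabled
DOCKER_BUILDKIT=1 docker build -t myimage .
```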
Thanks for the feedback here @gonuke - I hadn't explored `buildx`, and perhaps it will lead to improvement with the BuildKit caching issues. It sounds like a straightforward change to try!

For what it's worth, I've also run into situations where my build fails with BuildKit disabled; typically something silly like needing a trailing slash on a directory in a `COPY` step. In the short term, that (or something similar) may be a sufficient workaround.
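(A hypothetical illustration of the kind of fix meant here - the paths are made up:)

```dockerfile
# Trailing slashes mark both the source and destination explicitly as
# directories, which is the sort of small COPY tweak described above.
COPY assets/ /srv/app/assets/
```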
I think I found the difference: `buildx` propagates `build-arg`s across stages:
In case you care, here's my minimal working example:
```dockerfile
ARG foo="FOO"
ARG bar=3

FROM ubuntu:20.04 AS stage1
RUN echo stage1 foo = ${foo}

FROM stage1 AS stage2
ARG foo
RUN echo stage2 foo = ${foo}

FROM stage2 AS stage3
RUN echo stage3 foo = ${foo}
```
Using `buildx`, the value of `foo` is defined in `stage2` & `stage3` but NOT in `stage1`. Without `buildx`, the value of `foo` is only defined in `stage2`, and NOT in `stage1` or `stage3`.
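(Side note, in case it's useful: the builder-independent way I know of to get `foo` into every stage is to redeclare the global `ARG` inside each stage that uses it - a sketch based on the example above:)

```dockerfile
ARG foo="FOO"

FROM ubuntu:20.04 AS stage1
ARG foo            # redeclare to pull the global default into this stage
RUN echo stage1 foo = ${foo}

FROM stage1 AS stage2
ARG foo
RUN echo stage2 foo = ${foo}
```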
I mostly fixed my immediate problem, but did find another reason to use BuildKit/`buildx`. Imagine that you have stages like the following:
```dockerfile
FROM base AS stage1
...
FROM stage1 AS stage2A
...
FROM stage2A AS stage2B
...
FROM stage1 AS stage3A
...
FROM stage3A AS stage3B
...
```
With the legacy builder, if you try to build only `stage3B`, it will force building of all the stages. Notably, `stage2A` and `stage2B` are not required for `stage3B` but will be built anyway. (https://docs.docker.com/build/building/multi-stage/#differences-between-legacy-builder-and-buildkit)
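(To illustrate with a plain CLI example - the tag is a placeholder: with BuildKit enabled, building the `stage3B` target only walks its ancestors, `base` → `stage1` → `stage3A` → `stage3B`, and skips `stage2A`/`stage2B` entirely.)

```sh
# Legacy builder: builds every stage defined before the target
DOCKER_BUILDKIT=0 docker build --target stage3B -t myimage:stage3B .

# BuildKit/buildx: builds only the stages stage3B actually depends on
docker buildx build --target stage3B -t myimage:stage3B .
```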
Agreed! I've done some periodic testing to see if the layer caching will behave with buildkit on, and as of the last attempt, it was still inconsistent at best; as a workaround, I've added the intermediate stages into the cache list which at least speeds things up a bit, but it's far from optimal.
At the moment I don't have a ton of spare bandwidth to test the `buildx` approach, but if you want to have a go at it and send a PR I'd be more than happy to review :)
I definitely know the feeling of wanting a PR contributed 😁, but I'm not sure I have the bandwidth to learn enough about all of this to get it right or test it reliably.
Hey @gonuke - I had a little time today to make an attempt at this: #39. It took a ton of trial and error (since local behavior is, for whatever reason, a little different than GHA), but I think the head of that PR has a functionally-correct implementation.
I could definitely use a hand testing. I think the following procedure should work:
- Change `uses` to `firehed/multistage-docker-build-action@bc95fc29c4849d83e8473c7644900f559391d39c`
- Add `quiet: false` to the `uses` step (temporarily, for debugging!)
- Set `DOCKER_BUILDKIT=1` in your workflow or step's environment (example)
- Trigger a rebuild (e.g. `git commit --allow-empty -m "trigger rebuild"`)
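Putting those together, a rough sketch of the test step (the step name, repository, and stage names below are placeholders, not taken from an actual workflow):

```yaml
- name: Build images
  id: build
  uses: firehed/multistage-docker-build-action@bc95fc29c4849d83e8473c7644900f559391d39c
  env:
    DOCKER_BUILDKIT: 1
  with:
    repository: ghcr.io/your/repo
    stages: stage1, stage2
    server-stage: server
    quiet: false
```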
If this works, I'd love your feedback on the interface to this - I don't particularly like the approach of setting and detecting an environment variable. I'm considering a new optional boolean input (e.g. `parallel: true`) which will take care of things.
I've had quite a crazy week due to some dayjob stuff so I probably won't quite be able to wrap this up yet, but I'm thankful for your encouragement on this as it's been bothering me for a while too!
Thanks @Firehed. Pinging @bquan0 to take a look at this
Hi @Firehed, I ran some tests this week following the steps in this conversation, and the `buildx` functionality seems to be working as expected (args propagate to later stages, and stages that the final stage doesn't depend on aren't built). You can see more details in the PR above that I made in another repo.
Thanks @bquan0 and @gonuke! I'm wrapping things up for the week, but I should be able to do one more quick round of testing and cleanup early next week, and then we can hopefully get this landed and tagged!
@Firehed - do you know of any failure modes that we should probe? Our testing was very simple, so we may have overlooked some edge cases.
@gonuke Not to my knowledge; I wanted to get a few more builds running through the action to ensure that the layer cache is reliable, as all of the various `--cache-from` flags become a bit more complex.
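(Roughly the shape of that complexity - this is an illustrative command only, with made-up tags, not the action's actual invocation: each previously-built stage image gets passed as a cache source, so the flag list grows with the number of stages.)

```sh
docker buildx build \
  --target server \
  --cache-from ghcr.io/your/repo:stage-node-dependencies \
  --cache-from ghcr.io/your/repo:stage-frontend-build \
  --tag ghcr.io/your/repo:latest \
  .
```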
Note that on the branch, I made a small enhancement: you can now set `parallel: true` in the options (`with:` section) rather than having to mess with environment variables. I've been testing this on a project that should benefit significantly from parallel builds and it seems to work as expected.

I'm planning to land and tag the change (as `v1.7.0`) soon, but wanted to give you a heads up in case you want to test or provide additional feedback on the format or docs.
So now the change looks roughly like this:
```diff
  - name: Build images
    id: build
-   uses: firehed/multistage-docker-build-action@v1
+   uses: firehed/multistage-docker-build-action@c1bb0bf8b8dd37c020c4bc6d496851300f33e1d3
    with:
      repository: ghcr.io/your/repo
+     parallel: true
-     stages: node-dependencies, mix, php-dependencies, server-setup, server
+     stages: node-dependencies, frontend-build
      server-stage: webserver
```
(the final tagged version will not require the commit override - v1 will keep working once released)
Hi folks - just wanted to let you know that this has landed and is now in a tagged release. `v1.7` or later will support `buildx` by using the `parallel: true` option! Thanks again for your help testing and encouragement with this :)
I have a Dockerfile that builds successfully when using the `docker/build-push-action` action and fails when using this `multistage-docker-build-action`. The way that it fails is very subtle: in the former, a `cmake` command successfully finds a dependency, while in the latter it does not.

The only apparent difference between the two is that `docker/build-push-action` is using the `buildx` builder and this multistage action is not.

It may be as simple as changing the docker command from `build` to `buildx build` in `function buildStage`. I am not sure how to test this independently.
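(For reference, the proposed difference sketched as plain CLI commands - the target name and tag are placeholders, and this is only the idea, not the action's exact invocation:)

```sh
# Legacy builder (what buildStage apparently runs today, per the above)
docker build --target some-stage -t example:some-stage .

# BuildKit via buildx (what docker/build-push-action uses)
docker buildx build --target some-stage -t example:some-stage .
```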