Open jcampbell05 opened 2 years ago
Hello @jcampbell05 - the platform
is appended as provided and passed to docker build
command as --platform
flag - could you share a bit more about the output and the final image that is produced? Does your runner actually support building for arm64?
So essentially it builds as normal without any issues, only throwing an error that it's the wrong format when ran on lambda or if I force docker to pull down the arm64 base image I get a format error during the build step and spitting out that the host doesn't match the docker base image platform and that "platform" flag wasn't passed in.
As far as I'm aware the default Github runner should support building arm images or it should at least be possible to build using buildx.
I did an experiment where I built the image using my own script and encountered the same issue so t seems like it could be an environment issue and not a problem with serverless framework.
Okay on inspection the issue was two things;
I'm happy to add number 2 if it doesn't exist as a flag ?
So I have a fix in my fork for serverless. Essentially serverless checks to see if DOCKER_USE_BUILDX is set to a truthy value and if it is then it will use docker buildx build
and if not it will continue to use docker build
.
This allows it to be opt in for people in my case who need it but continue to use the stable build command for the majority of users.
If this sounds good I can open a PR @pgrzesik
Hello @jcampbell05 - thanks a lot for providing more details and clarifications.
As for the solution - I'm honestly not sure about using an env variable instead of a setting in Serverless configuration e.g. under provider.ecr
or specifically for each of the images.
@mnapoli @medikoo what is your opinion on that?
build --platform
seems to be supported since a specific Docker version, is that correct? Does that mean that if you upgrade Docker, then things should work? (is that even possible on GitHub Actions?)
Basically I'm wondering whether we should add an option here. What would it take for things to work correctly by default?
Alternatively, could we (should we?) switch to buildx
by default then?
build --platform seems to be supported since a specific Docker version, is that correct? Does that mean that if you upgrade Docker, then things should work? (is that even possible on GitHub Actions?)
That would not be enough - because e.g. if your runner is amd64, then it still might not be possible to build arm64 images and vice versa.
Alternatively, could we (should we?) switch to buildx by default then?
I don't think so as it's a "plugin" that most people will probably not use at all so the command won't even be available.
That would not be enough - because e.g. if your runner is amd64, then it still might not be possible to build arm64 images and vice versa.
Wait, are you saying this is specifically for GitHub Actions, or are you saying that docker build
does not support at all cross-platform builds? (I wonder what the option is for then)
I'm trying to sum up the problem entirely, before we add new options. I.e. in which scenarios specifically is the current feature limited.
So I've dug into this issue and looked into all of the Docker documentation and I think I've figured out the issue and solution. Unfortunately Docker hasn't consolidated it in one easy to understand place so I will link through out the comment to the useful resources.
Github Action's default Linux runner uses a AMD processor and Ubuntu 20.04 (https://github.com/actions/virtual-environments) which by default has Docker Moby 20.10.14 and Docker BuildX 0.8.2 installed (https://github.com/actions/virtual-environments/blob/main/images/linux/Ubuntu2004-Readme.md). This version supports the --platform
argument.
According to their documentation this platform for the docker client only requests from the docker server a particular architecture when pulling or building an image but doesn't guarantee that it will respect it if the server hasn't been configured to support it (https://docs.docker.com/engine/reference/commandline/build/).
By default Docker Desktop for Mac & Windows supports multiple architectures out of the box because it has been setup to ship with BuildX, QEMU and binfmt_misc to run binaries of any architecture (https://www.docker.com/blog/multi-arch-images/) as well as configured BuildX to be the default builder meaning "docker build" is actually an alias of "docker buildx build".
Linux Docker requires the user to configure it to support this and uses the standard build command by default which only builds images for the architecture of the host system and will ignore the platform flag.
There are three ways to configure it (https://docs.docker.com/buildx/working-with-buildx/) but the easiest way to configure it is as follows:
By default BuildX is installed when Docker is installed via the Deb or RPM packages (https://docs.docker.com/buildx/working-with-buildx/#linux-packages).
From what I can see Github Actions has only achieved number 1 since they leave it up to the user how they want to achieve multi-platform builds. So the user has to perform steps 2 & 3 to build for other architectures themselves (It looks like it hasn't even been aliased).
Docker has helpfully provided Github Actions to install QEMU (https://github.com/docker/setup-qemu-action) which leaves the user to handle step number 3.
Since Serverless currently doesn't support being told to use BuildX we have to set it up as an alias by running "docker buildx install" which should solve the issue and avoid making any changes to the Serverless framework (https://docs.docker.com/engine/reference/commandline/buildx_install/)
However to make this a better developer experience and to work out of the box I would suggest the following tweaks:
docker buildx install
. If it isn't installed then the user will get an error telling them to install it. Right now because Serverless does no validation and just uses docker build
in all situations it fails to build containers for other platforms to the host when BuildX isn't configured to be the default builder and not configured correctly.
The above should require no additional environmental variables or changes to the Serverless configuration file. Instead improving the existing documentation and behaviour by the Framework.
Thanks a lot @jcampbell05 for providing such an extensive overview π
+1, thanks for the details!
Having Serverless Framework automatically install buildx sounds useful right now, but I would be afraid it might be a liability over time? What if buildx is no longer needed in a few months from now (because Docker now support it natively)? Are we (Framework maintainers) in a capacity to 100% follow these changes and own the "buildx" specific logic? Those would be my questions so far.
That's why I'm wondering: would it make sense to not do anything automatically, but instead focus on guiding users through docs and actionable error messages? (which is part of what you suggest)
And +1 on the questions of @pgrzesik above
2. Could you clarify how that should work? How should the detection of proper architecture be implemented in your mind?
So we can get all the information about the host platform using docker version
and it should spit out the host architecture.
Here is an example of what mine prints:
Client:
Cloud integration: 1.0.17
Version: 20.10.8
API version: 1.41
Go version: go1.16.6
Git commit: 3967b7d
Built: Fri Jul 30 19:55:20 2021
OS/Arch: darwin/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.8
API version: 1.41 (minimum version 1.12)
Go version: go1.16.6
Git commit: 75249d8
Built: Fri Jul 30 19:52:10 2021
OS/Arch: linux/amd64
Experimental: false
We can simply detect if OS/Arch: for the Server matches the platform provided by the user in serverless.yml
3. What would be the exact steps to verify the installation with the least possible overhead?
For BuildX docker buildx version
is all that would be needed to detect it's installed. If we get a version number then it's installed but if we get a "buildx is not a command" error then it's not installed.
For QEMU we can detect if qemu-img is a valid command using "which qemu-img" as this is a tool installed by QEMU for detecting image formats.
Having Serverless Framework automatically install buildx sounds useful right now, but I would be afraid it might be a liability over time? What if buildx is no longer needed in a few months from now (because Docker now support it natively)? Are we (Framework maintainers) in a capacity to 100% follow these changes and own the "buildx" specific logic? Those would be my questions so far.
So it wouldn't be installed by Serverless but rather detect it and used if available.
It's not clear yet if Docker ever intends to merge it into the main build command as it also as features such as targeting a remote runner to build a container which might make the out of the box build command a bit heavy. It appears that their intention is for buildx to become an "Advanced" builder.
I would certainly be surprised if it needed any special logic in the future as the fact you can alias to it shows to me that they intend for it to be a drop in replacement for existing scripts. So I would expect for them to provide backwards compatible upgrade path we can utilise.
But If we are afraid it might become a liability then just updating the documentation and throwing an error messages when the above conditions are met to the user would be a big improvement on what we have now.
But I don't see that being a huge reduction in liability for the framework developers as you would still need to update the logic for when the error message should be displayed and so potentially there should be a way to skip this error message in case it no longer applies.
On the other hand that might be a premature optimisation and Docker should hopefully provide docker buildx
for backwards compatibility if they do one day plan to merge it.
So with all the above considered perhaps it would just be simpler to either only update the docs or in addition add a use_buildx option ?
Hello @jcampbell05 - I think we should keep the Framework inference into the build process minimal and potentially just detect if the host architecture is different and present a warning to user about that - I would be hesitant to throwing any errors or more extensive intervention as it might be intended by user and already handled by e.g. aliasing docker bulild
with buildx
. That information should provide potential guidance on how to address the issue if at a later point, the user would run into architecture mismatch
Sounds good, so to summarise:
Hey @jcampbell05 - sorry for not responding sooner, I was off this past week. That summary sounds good to me, what do you think @mnapoli ?
Yes that's good to me.
So following up on this I've found that the way BuildX can be configured to build multi-architecture containers can prevent images automatically being available for Serverless to be able to tag and push them.
When using the standard docker builder this automatically happens which is what serverless currently relies on. But if using Build X's docker container, kubenetes or any other build plugin it does not do this unless you pass the --load
parameter. (https://docs.docker.com/build/building/drivers/docker-container/#loading-to-local-image-store)
I had to alias the "docker" command to a custom script which injects this flag since serverless fails to use this flag and there currently is no way to pass your own command line flags for the build command
This script is below
#!/bin/sh
#
# Note:
#
# A Docker shim to workaround fact Serverless doensn't specify "--load" flag to automatically load images into image list when build using
# another Build X builder plugin other than the default docker builder nor provide anyway for us to provide our own flags.
#
# See more here https://github.com/serverless/serverless/issues/10982
#
# This shim simply calls the original docker with the flags serverless originally provided alongside our additional flags to fix the building
# issue and any other Github Action optimizations.
#
if [ $1 = "build" ]
then
/usr/bin/docker $@ --load
else
/usr/bin/docker $@
fi
Hey @jcampbell05 Can you show me how you used that script as an alias in github actions? I have been trying for a while and it doesn't seem like the alias for docker is being carried over to the serverless process.
Edit:
Nevermind figured it out! Ended up copying the shim to a place in PATH before /usr/bin:
copy docker_shim.sh ~/.local/bin/docker
@rajp33 I used the Github Actions script to add the docker script to the PATH environment variable so that all docker commands invoke it.
Are you certain it's a bug?
Is the issue caused by a plugin?
Are you using the latest version?
Is there an existing issue for this?
Issue description
When specifying a platform for my image it fails to have any effect on Github actions instead being built for linux/amd64 instead of my specified platform.
The image builds fine since it's a multi-arch base image but when ran it fails since the lambda is expecting a arm64 container and gets a amd64.
Service configuration (serverless.yml) content
Command name and used flags
npx sls deploy --stage dev
Command output
Environment information