PowerDNS / pdns

PowerDNS Authoritative, PowerDNS Recursor, dnsdist
https://www.powerdns.com/
GNU General Public License v2.0

Support for Multi-Arch Docker Images #10024

Open james-crowley opened 3 years ago

james-crowley commented 3 years ago

Short description

I would like to see the support for multi-arch images as PowerDNS starts to build out their Docker image offerings. Specifically, support for s390x and ppc64le would be extremely helpful.

Usecase

I want to be able to deploy and use PowerDNS in any environment/system without worrying about the architecture I am running on. I do a lot of work on Linux on Z (s390x), and having an s390x image would be helpful.

Description

I have helped multiple open source communities enable multi-arch support for their Docker images using manifests. There are a couple of build tools that make the heavy lifting easy, but the Dockerfiles need to be designed to be architecture-aware.

I saw two issues/PRs open talking about multi-arch support:

#10016

#8655

@Habbie, you seem to be involved a lot in the Docker development and also enabling multiple architectures. Would supporting s390x and ppc64le be something you could be open to?

Additionally, if you need access to s390x and ppc64le resources there are multiple ways I can assist with that. I did see mention of Travis CI in #8655, but do not see a lot of use of CI/CD in this repo. How are builds currently being handled?

Habbie commented 3 years ago

I saw two issues/PRs open talking about multi-arch support:

As discussed on #10016, those are not actually about the Docker images, although the work sometimes interleaves a bit!

There are a couple of build tools that make the heavy lifting easy, but the Dockerfiles need to be designed to be architecture-aware.

Help with that would be much appreciated.

@Habbie, you seem to be involved a lot in the Docker development and also enabling multiple architectures.

Yes, both things somehow ended up mostly on my plate.

Would supporting s390x and ppc64le be something you could be open to?

Yes, although it depends on the effort from our side, and the maintenance burden. Somewhat related to that, one of the goals of the work in #10016 (that helps us migrate armhf builds off of qemu) is reducing our build times from six hours (armhf in qemu) to 20-30 minutes (armhf on hardware arm64), and we'd prefer that adding platforms does not bring back the six hour builds.

Additionally, if you need access to s390x and ppc64le resources there are multiple ways I can assist with that.

That would definitely solve the 'qemu' problem!

I did see mention of Travis CI in #8655, but do not see a lot of use of CI/CD in this repo. How are builds currently being handled?

Complete story: https://github.com/PowerDNS/pdns/wiki/Automatic-testing

In short: we test on CircleCI and GH Actions; we build packages on our Buildbot; Docker images are autobuilt by the Docker hub. We have removed all traces of Travis (except on older release branches) because of quality issues with the service.

james-crowley commented 3 years ago

@Habbie I am happy to point you to those tools. Some scripts are from internal use, some are used in other open source projects I contributed to, and some are from the AdoptOpenJDK project. Let me grab the links and send them to you.

Building on the native architecture is something I always try to encourage, but it can be hard due to lack of resources. As you experienced, emulation can cause slowdowns, and on rare occasions, when working with low-level languages like C, you can have serious issues with builds.

Looks like you have added a worker to your Buildbot server, https://builder.powerdns.com/#/workers ? Was this a server from the AWS offering mentioned in #10016? If so, we could do a similar thing and give PowerDNS access to s390x and ppc64le systems running Linux. You can add them as workers to your Buildbot server and use them as you would like. If that interests you, let me know and I can point you to the portal for the free resources. Hopefully that helps with the maintenance concerns.

As of right now, there is no native integration with GitHub Actions for ppc64le and s390x. It is something we are aware of and working towards fixing. As for CircleCI, there might be something in the works, so keep an eye out!

Lastly, to get some testing going on s390x and ppc64le should I be looking at https://github.com/PowerDNS/pdns-builder/tree/6176c5f68354ca82814ef20a4c87327785157ff3 for build scripts?

Habbie commented 3 years ago

@Habbie I am happy to point you to those tools. Some scripts are from internal use, some are used in other open source projects I contributed to, and some are from the AdoptOpenJDK project. Let me grab the links and send them to you.

Thanks!

Building on the native architecture is something I always try to encourage, but it can be hard due to lack of resources. As you experienced, emulation can cause slowdowns, and on rare occasions, when working with low-level languages like C, you can have serious issues with builds.

Yes, we'd rather never go the qemu route again. We have some informal periodic builds on OpenBSD/arm64 and MacOS on M1, and they do find things. What you call 'serious issues with builds in low level languages' we do like to call 'bugs caught early' :D

Looks like you have added a worker to your Buildbot server, https://builder.powerdns.com/#/workers ?

I briefly added an arm64 worker there yesterday, then removed it again because our configuration was not ready for it.

Was this a server from the AWS offering mentioned in #10016?

The arm64 worker that we had there briefly was an AWS instance; we have not signed up for the free credits for that yet, but we might.

If so, we could do a similar thing and give PowerDNS access to s390x and ppc64le systems running Linux. You can add them as workers to your Buildbot server and use them as you would like. If that interests you, let me know and I can point you to the portal for the free resources. Hopefully that helps with the maintenance concerns.

I kind of saw this coming from your first message. It would be much appreciated.

As of right now, there is no native integration with GitHub Actions for ppc64le and s390x. It is something we are aware of and working towards fixing.

Ah! I shared with @pieterlexis the expectation that we could add them as GH Action Runners - can you explain why we couldn't do that today?

As for CircleCI, there might be something in the works, so keep an eye out!

Please keep us posted :)

Lastly, to get some testing going on s390x and ppc64le should I be looking at https://github.com/PowerDNS/pdns-builder/tree/6176c5f68354ca82814ef20a4c87327785157ff3 for build scripts?

There are two things.

(1) we have docker images, built from Dockerfile-* in the root of our repo. Using them should be as simple as docker build -f Dockerfile-auth . (plus the other two products). This would build on your 'default' architecture; if you wanted to build for something else (presumably by using qemu, or in the ARM case, running arm32 on arm64 natively), you'd edit the FROM lines.

(2) packages are built by checking out pdns.git, running git submodule init and git submodule update (this will pull in pdns-builder), then running builder/build.sh, reading the instructions, and deciding to run something like builder/build.sh -m authoritative debian-buster. As you mentioned in another thread, the docker build used internally for that (entirely unrelated to (1) ! ) should just take the present platform as default. #10016 also allows you to be explicit about the target platform by running something like builder/build.sh debian-buster-arm64, and expanding that to other platforms should be obvious I hope (from #10016).

There should be no need to change anything in pdns-builder for supporting different architectures, at least the way we are doing it now.
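For reference, the two routes described above can be sketched as a small script. This is only an illustration built from the commands quoted in this comment: the `run` helper, the `DRY_RUN` switch, and the function names are hypothetical and not part of the pdns repository. By default it only prints what it would run.

```shell
# Sketch only: `run`, DRY_RUN and the function names are illustrative helpers,
# not part of pdns.git. With DRY_RUN=1 (the default) commands are printed, not run.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# (1) Docker image for the authoritative server, built for the host's
#     default architecture from the Dockerfile in the repository root.
build_auth_image() {
    run docker build -f Dockerfile-auth .
}

# (2) Packages via pdns-builder, which is pulled in as a git submodule.
#     The target defaults to debian-buster, as in the comment above.
build_auth_packages() {
    run git submodule init
    run git submodule update
    run builder/build.sh -m authoritative "${1:-debian-buster}"
}

build_auth_image
build_auth_packages
```

Setting `DRY_RUN=0` inside a pdns.git checkout would execute the commands instead of printing them. The explicit-platform form mentioned above (for example a `debian-buster-arm64` target) would be passed as the argument to `build_auth_packages`.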

james-crowley commented 3 years ago

@Habbie Do you have an email I can use to start the process of getting you s390x and ppc64le resources? I saw an email listed on your GitHub profile but wanted to double check that it was current. If you don't want to post the email here, feel free to email me at the email listed on my GitHub profile.

I took a look at your current GitHub actions attached to this repo, and I am not 100% sure how you are utilizing them. Could you clarify your process?

GitHub Actions should work if you are using it to connect to another service to build Docker images, like spinning up a VM on another cloud provider or calling another CI/CD platform like CircleCI/Travis. As it currently stands, the GitHub runner/agent used to add your own workers to GitHub Actions is written in .NET, which does not support s390x and ppc64le. So IBM cannot directly make s390x and ppc64le resources available on the platform for users.

Thanks for breaking down the process of the Docker builds and how packages are made! I am going to spend some time today/tomorrow to confirm both work. If not, I'll post back with the tweaks or hacks needed to enable s390x and ppc64le support.

Thanks for all the help so far!

Habbie commented 3 years ago

@Habbie Do you have an email I can use to start the process of getting you s390x and ppc64le resources? I saw an email listed on your GitHub profile but wanted to double check that it was current. If you don't want to post the email here, feel free to email me at the email listed on my GitHub profile.

That email is current but private, please use peter.van.dijk@powerdns.com for this.

(will respond to the rest later)

james-crowley commented 3 years ago

Thanks for the email. I totally forgot to link you to some of the build tools/scripts:

https://github.com/jordan-cartwright-ibm/example-docker-manifest/tree/master/.ci

This is one of the scripts we use in the open source community. Helps break down how multi-arch images are built and handled. While the example has Travis CI, it was made to be able to work with any CI/CD. @jordan-cartwright-ibm can answer any questions you might have. This is what I based some of my contributions to the Jenkins Docker project off of.
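As a rough illustration of the manifest approach (not taken from the linked scripts), the sketch below prints the `docker manifest` commands that would stitch per-architecture images into one multi-arch tag. The image name `example/pdns-auth` and the `<tag>-<arch>` naming scheme are assumptions made up for the example, and it presumes each per-arch image was already built and pushed from a machine of that architecture. On older Docker versions `docker manifest` may require experimental CLI features to be enabled.

```shell
# Sketch only: prints the commands instead of running them.
# The "<tag>-<arch>" per-architecture tag scheme is a hypothetical convention.
print_manifest_commands() {
    image="$1"
    tag="$2"
    shift 2

    # Collect the per-architecture image references.
    refs=""
    for arch in "$@"; do
        refs="$refs $image:$tag-$arch"
    done

    # One manifest list that points at every per-arch image.
    echo "docker manifest create $image:$tag$refs"

    # Record the architecture of each entry explicitly.
    for arch in "$@"; do
        echo "docker manifest annotate $image:$tag $image:$tag-$arch --arch $arch"
    done

    echo "docker manifest push $image:$tag"
}

# Hypothetical image name, with the architectures discussed in this issue.
print_manifest_commands example/pdns-auth master amd64 s390x ppc64le
```

A client pulling the combined tag would then receive the image matching its own architecture.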

There is also Buildx, https://docs.docker.com/buildx/working-with-buildx/, but it is still in "beta". Plus the default uses qemu for building the images. The documentation hints at allowing you to list other workers for that, but I have not successfully gotten that to work.

Another option is the official Docker Image build CI/CD which has access to a bunch of different architectures. But that would require PowerDNS to become an official image on DockerHub.

james-crowley commented 3 years ago

@Habbie running into some issues building the Dockerfile, Dockerfile-auth. First, on s390x the package libluajit-5.1-dev is not available in Debian or Ubuntu. This seems to be linked to the lack of a JIT for Lua. More information can be found here:

The error is:

Correcting dependencies...Starting pkgProblemResolver with broken count: 1
Starting 2 pkgProblemResolver with broken count: 1
Investigating (0) pdns-build-deps:s390x < 1.0 @iU mK Nb Ib >
Broken pdns-build-deps:s390x Depends on libluajit-5.1-dev:s390x < none @un H >
  Removing pdns-build-deps:s390x because I can't find libluajit-5.1-dev:s390x
Done
 Done
Starting pkgProblemResolver with broken count: 0
Starting 2 pkgProblemResolver with broken count: 0
Done

........

Processing triggers for libc-bin (2.28-10) ...
mk-build-deps: Unable to install pdns-build-deps at /usr/bin/mk-build-deps line 416.
mk-build-deps: Unable to install all build-dep packages

I spoke with the developer of that PR. He mentioned building LuaJIT with that patch should work. Oddly enough, other distros such as OpenSUSE and Fedora seem to have pulled in the unofficial patches.

That is as far as I got with s390x. I have not tried building LuaJIT with the patch yet. Since I hit that blocker, I switched over to ppc64le for a second. It looks like libluajit-5.1-dev is available for ppc64le. Seems like you are aware of some of the issues: https://github.com/PowerDNS/pdns/issues/8655#issuecomment-696037838.

I got a little further in the build for ppc64le but the configure command on:

https://github.com/PowerDNS/pdns/blob/2d8d7e080c2482649c17e2ea34560466bed1cc50/Dockerfile-auth#L45-L52

Throws an error:

checking for mysql_config... (cached) /usr/bin/mysql_config
checking for odbc_config... no
checking for unixODBC library directory... configure: error: Did not find the unixodbc library dir in '/usr/local/unixodbc/lib/unixodbc /usr/local/lib/unixodbc /opt/unixodbc/lib/unixodbc         /usr/lib/unixodbc /usr/lib64/unixodbc /usr/local/unixodbc/lib /usr/local/lib /opt/unixodbc/lib /usr/lib         /usr/sfw/lib/ /usr/lib/odbc /usr/lib/x86_64-linux-gnu NONE/lib'

From the error it seems to be missing a needed package, unixODBC. Not sure why this is missing; maybe it's installed by default on x86 but not on ppc64le? Either way, I am going to try to add unixodbc to the list of packages installed in step 3 of the Docker build. Hopefully that will fix that error and we can get to the make.

Habbie commented 3 years ago

@Habbie running into some issues building the Dockerfile, Dockerfile-auth. First, on s390x the package libluajit-5.1-dev is not available in Debian or Ubuntu. This seems to be linked to the lack of a JIT for Lua. More information can be found here:

We're well aware of this one - on arm64, it is available but broken. Our debian package builds have this:

    libluajit-5.1-dev [!arm64],
    liblua5.3-dev [arm64],

Indeed our Dockerfile, that we only tested on amd64, does not make that distinction. It looks like both the Dockerfile and the Debian/Ubuntu (and RPM!) packaging should be aware of the right choice for more platforms.
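One way a Dockerfile or build script could make the same distinction is a small lookup keyed on the Debian architecture name (as printed by `dpkg --print-architecture`). This is a sketch, not the project's actual fix: the function name is hypothetical, and treating s390x like arm64 is an assumption based on the report earlier in this thread that libluajit-5.1-dev is unavailable there.

```shell
# Sketch only: pick the Lua development package for a Debian architecture.
# arm64 follows the packaging excerpt above; s390x is added as an assumption
# because libluajit-5.1-dev was reported missing on that architecture.
lua_dev_package() {
    case "$1" in
        arm64|s390x) echo "liblua5.3-dev" ;;
        *)           echo "libluajit-5.1-dev" ;;
    esac
}

# On a Debian-based system the argument would typically come from:
#   lua_dev_package "$(dpkg --print-architecture)"
lua_dev_package s390x
lua_dev_package amd64
```

The matching `--with-lua` value for ./configure would need the same per-architecture switch.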

From the error it seems to be missing a needed package, unixODBC. Not sure why this is missing; maybe it's installed by default on x86 but not on ppc64le? Either way, I am going to try to add unixodbc to the list of packages installed in step 3 of the Docker build. Hopefully that will fix that error and we can get to the make.

The step with mk-build-deps should pull that in. I can't immediately explain why it does not. Adding it to a step explicitly will be useful for debugging but is not the right fix.

Habbie commented 3 years ago

I took a look at your current GitHub actions attached to this repo, and I am not 100% sure how you are utilizing them. Could you clarify your process?

If you're asking about Docker images, we currently build those on the Docker Hub, but that limits us to amd64 (unless we accept qemu, which I don't like and also I don't think is fair on their free build service). With the addition of arm64, and pending this work, more platforms, we think it would make sense to move Docker image builds to GH Actions or Buildbot, with a preference for GH Actions I think.

GitHub Actions should work if you are using it to connect to another service to build Docker images, like spinning up a VM on another cloud provider or calling another CI/CD platform like CircleCI/Travis. As it currently stands, the GitHub runner/agent used to add your own workers to GitHub Actions is written in .NET, which does not support s390x and ppc64le. So IBM cannot directly make s390x and ppc64le resources available on the platform for users.

Ah yes, I forgot about the runner requiring .NET. We'll have to see what makes the most sense for integration there.

james-crowley commented 3 years ago

@Habbie I think GitHub Actions might be a good idea to move to. Even then, I think there probably exists a Buildbot action to kick off a remote job. So if you can build for amd64 and arm64 natively on GitHub Actions, great. But if you need to call a remote job to build ppc64le and s390x, that should hopefully work as well.

The step with mk-build-deps should pull that in. I can't immediately explain why it does not. Adding it to a step explicitly will be useful for debugging but is not the right fix.

I tried adding unixODBC like I mentioned to the apt-get install list but even then the command still fails. I am not sure why. Any ideas on how to disable/bypass or fix the problem? I want to keep trying to hack away at getting a s390x and ppc64le build.

I saw where you disabled the build dependency for libluajit-5.1-dev. I am going to add a local change to do the same for s390x, so I can get past that blocker. Hopefully I do not run into the unixODBC error on s390x as well.

Lastly, I know in the Dockerfile you had a couple of comments about pulling in a tarball instead of building from source. Are you planning on moving that way for installs in Docker? I think installing a pre-built tarball might be a cleaner solution for the Docker installs. Or even pulling down and installing the package via the OS's package manager. This way the install process for Docker and standard bare metal installs are similar.

Habbie commented 3 years ago

Lastly, I know in the Dockerfile you had a couple of comments about pulling in a tarball instead of building from source. Are you planning on moving that way for installs in Docker? I think installing a pre-built tarball might be a cleaner solution for the Docker installs. Or even pulling down and installing the package via the OS's package manager. This way the install process for Docker and standard bare metal installs are similar.

The comments are about source tarballs vs. the git tree. We have 3 products in our git tree (auth, dnsdist, recursor) that we distribute as 3 tarballs, where the dnsdist and recursor tarballs have a slightly different structure than the git tree. Right now, you cannot build dnsdist or recursor Docker images from their source tarballs, only from the git tree.

The Docker images do not use our packages repositories on purpose, so that users can use the Dockerfiles to easily build custom or patched software.

Habbie commented 3 years ago

Any ideas on how to disable/bypass or fix the problem?

You could try removing godbc from the configure line.

james-crowley commented 3 years ago

@Habbie Thanks for the tip. I'll remove godbc to keep going. I added in the lua5.3 support for s390x since there is no JIT.

Ran into the same error with unixODBC.

checking for mysql_config... /usr/bin/mysql_config
checking for GEOIP... yes
checking for MMDB... yes
checking for YAML... yes
checking for mysql_config... (cached) /usr/bin/mysql_config
checking for odbc_config... no
checking for unixODBC library directory... configure: error: Did not find the unixodbc library dir in '/usr/local/unixodbc/lib/unixodbc /usr/local/lib/unixodbc /opt/unixodbc/lib/unixodbc         /usr/lib/unixodbc /usr/lib64/unixodbc /usr/local/unixodbc/lib /usr/local/lib /opt/unixodbc/lib /usr/lib         /usr/sfw/lib/ /usr/lib/odbc /usr/lib/x86_64-linux-gnu NONE/lib'

@kpfleming Did you run into similar issues building for arm64?

james-crowley commented 3 years ago

Got closer on s390x but ran into this issue when compiling:

  CXX      dnspcap2protobuf.o
In file included from ../ext/protozero/include/protozero/pbf_writer.hpp:19,
                 from protozero.hh:24,
                 from dnspcap2protobuf.cc:32:
../ext/protozero/include/protozero/basic_pbf_writer.hpp:26:11: fatal error: protozero/byteswap.hpp: No such file or directory
 # include <protozero/byteswap.hpp>
           ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [Makefile:3215: dnspcap2protobuf.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/source/pdns'
make: *** [Makefile:2704: all] Error 2
make: Leaving directory '/source/pdns'

Habbie commented 3 years ago

checking for unixODBC library directory... configure: error: Did not find the unixodbc library dir in '/usr/local/unixodbc/lib/unixodbc /usr/local/lib/unixodbc /opt/unixodbc/lib/unixodbc /usr/lib/unixodbc /usr/lib64/unixodbc /usr/local/unixodbc/lib /usr/local/lib /opt/unixodbc/lib /usr/lib /usr/sfw/lib/ /usr/lib/odbc /usr/lib/x86_64-linux-gnu NONE/lib'

On my amd64 system, unixODBC is found in UNIXODBC_LIBS='-L/usr/lib/x86_64-linux-gnu -lodbc' which is not a very portable sounding place.

@kpfleming Did you run into similar issues building for arm64?

I did not run into this issue building packages on arm64; I'll try a Docker image now.

Habbie commented 3 years ago

Got closer on s390x but ran into this issue when compiling:

pbf_writer.hpp has:

#if PROTOZERO_BYTE_ORDER != PROTOZERO_LITTLE_ENDIAN
# include <protozero/byteswap.hpp>
#endif

so that broken include(path) never manifested on other systems!
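s390x is big-endian, which is why it was the first platform to take that preprocessor branch. A quick way to check which branch a given build host would take is to probe its byte order from the shell. The helper below is a sketch with a hypothetical name: it writes the bytes 0x01 0x00 and lets `od` interpret them as a single 16-bit value in host byte order.

```shell
# Sketch only: report the byte order of the machine this runs on.
host_endianness() {
    # Bytes 0x01 0x00 read as one 16-bit integer:
    # little-endian hosts see 0x0001, big-endian hosts see 0x0100.
    word="$(printf '\001\000' | od -An -tx2 | tr -d ' \n')"
    case "$word" in
        0001) echo "little" ;;
        0100) echo "big" ;;
        *)    echo "unknown" ;;
    esac
}

host_endianness
```

Based on the thread, amd64, arm64 and ppc64le hosts would be expected to report little, and s390x big.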

Habbie commented 3 years ago

My arm64 docker build also cannot find unixODBC. This is good news because now I can fix it for you :)

Habbie commented 3 years ago

Our package builds (which is what @kpfleming tested) don't have the ODBC problem because dh_auto_configure (Debian packaging) passes --libdir to ./configure, and CentOS has odbc_config.

Habbie commented 3 years ago

I can build our Docker images on arm64 with master + https://gist.github.com/Habbie/bfc5aef067fd850a5137191534298ea9

Habbie commented 3 years ago

I can build our Docker images on arm64 with master + https://gist.github.com/Habbie/bfc5aef067fd850a5137191534298ea9

#10028

Habbie commented 3 years ago

# include <protozero/byteswap.hpp>

Just opened this PR: protozero: make internal includes work #10030

I believe it should solve this problem for you.

james-crowley commented 3 years ago

@Habbie The fixes you posted worked with slight modifications.

For s390x, the patch for protozero worked great. That now allows big endian systems to run without issues. The changes I needed to make for s390x were:

    ./configure \
      --with-lua=lua5.3 \
      --sysconfdir=/etc/powerdns \
      --enable-option-checking=fatal \
      --with-dynmodules='bind geoip gmysql godbc gpgsql gsqlite3 ldap lmdb lua2 pipe random remote tinydns' \
      --enable-tools \
      --enable-ixfrdist \
      --with-unixodbc-lib=/usr/lib/$(uname -m)-linux-gnu && \

--with-lua needed to be set to lua5.3 to avoid using the LuaJIT which does not exist for s390x. Additionally, your fix for the --with-unixodbc-lib worked fine. The path on s390x is /usr/lib/s390x-linux-gnu.

For ppc64le, no patch was needed for protozero as the architecture is in little endian mode. Additionally, LuaJIT does exist for it but is not considered official; Debian and other distros decided to package it anyway. The changes I needed to make for ppc64le were:

    ./configure \
      --with-lua=luajit \
      --sysconfdir=/etc/powerdns \
      --enable-option-checking=fatal \
      --with-dynmodules='bind geoip gmysql godbc gpgsql gsqlite3 ldap lmdb lua2 pipe random remote tinydns' \
      --enable-tools \
      --enable-ixfrdist \
      --with-unixodbc-lib=/usr/lib/powerpc64le-linux-gnu && \

Unfortunately, ppc64le goes by a couple of different names (sometimes ppc64el) and somehow the path for --with-unixodbc-lib ended up being /usr/lib/powerpc64le-linux-gnu, which does throw a wrench into your fix of --with-unixodbc-lib=/usr/lib/$(uname -m)-linux-gnu.

I think you could still keep the one Dockerfile but you would need to add some bash "magic" to determine if the arch is ppc64le and if so replace the name with powerpc64le. Just for your information uname -m on ppc64le produces ppc64le.
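A minimal sketch of that bash "magic" might look like the following. The function name is hypothetical, and only the machine names seen in this thread are considered (x86_64, aarch64, s390x, ppc64le); other architectures, such as 32-bit ARM, use directory names this simple mapping does not cover.

```shell
# Sketch only: map `uname -m` output to the directory name under /usr/lib.
# Only ppc64le needs renaming among the machines discussed in this thread.
multiarch_dir() {
    machine="${1:-$(uname -m)}"
    case "$machine" in
        ppc64le) machine="powerpc64le" ;;
    esac
    printf '%s-linux-gnu\n' "$machine"
}

# Example: the value that would be passed to ./configure on ppc64le.
echo "--with-unixodbc-lib=/usr/lib/$(multiarch_dir ppc64le)"
```

Called without an argument, the function falls back to the running machine's `uname -m`.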

Habbie commented 3 years ago

Thanks Jim, that's roughly the level of success I was expecting. All of this is solvable, indeed we'll need a bit of magic here and there.

Habbie commented 3 years ago

from a tip by @pieterlexis:

root@b4f07c08ad43:/# dpkg-architecture -q DEB_BUILD_GNU_TYPE
powerpc64le-linux-gnu
root@b4f07c08ad43:/# uname -m
ppc64le

james-crowley commented 3 years ago

@Habbie That is a good tip! Just confirmed the path on s390x: s390x-linux-gnu. Looks like that will work. Does it work on arm64? I do not have an arm system to test it on.

Habbie commented 3 years ago

I don't have a ppc64 to test on so I used docker run -ti --rm ppc64le/debian:buster (with qemu-static configured) - but, here's output from our AWS arm64 box:

peter@buildbot-worker-arm64-1:~$ dpkg-architecture -q DEB_BUILD_GNU_TYPE
aarch64-linux-gnu
peter@buildbot-worker-arm64-1:~$ uname -m
aarch64

So yes, that works!
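Putting the tip to use, the library directory could be derived from `dpkg-architecture` rather than `uname -m`. The sketch below is an assumption about how that might be wired up, not the change that was merged: the function name and the `TRIPLET` override (useful for trying it on machines without dpkg-dev installed) are hypothetical.

```shell
# Sketch only: derive the unixODBC library directory from dpkg-architecture.
# TRIPLET may be preset to bypass the lookup (hypothetical override for testing).
unixodbc_libdir() {
    triplet="${TRIPLET:-}"
    if [ -z "$triplet" ] && command -v dpkg-architecture >/dev/null 2>&1; then
        triplet="$(dpkg-architecture -q DEB_BUILD_GNU_TYPE)"
    fi
    if [ -z "$triplet" ]; then
        echo "cannot determine GNU type (is dpkg-dev installed?)" >&2
        return 1
    fi
    printf '/usr/lib/%s\n' "$triplet"
}

# Example with the ppc64le value shown earlier in the thread.
TRIPLET=powerpc64le-linux-gnu unixodbc_libdir
```

In a Dockerfile this would feed `--with-unixodbc-lib=` on the ./configure line, replacing the `$(uname -m)-linux-gnu` form.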

Habbie commented 4 months ago

https://hub.docker.com/r/powerdns/pdns-recursor-51/tags has our first official dual-arch image. Feedback would be very welcome!