s3gw-tech / s3gw

Container able to run on a Kubernetes cluster, providing S3-compatible endpoints to applications.
https://s3gw.tech
Apache License 2.0

[FR] Multiarch support #830

Open m-ildefons opened 10 months ago

m-ildefons commented 10 months ago

Longhorn ships for amd64, arm64 and experimentally for s390x. To complete integration with Longhorn, the s3gw needs to be built for all of Longhorn's supported architectures.

The current roadmap for aligning s3gw's target architectures with Longhorn's supported architectures is as follows:

|              | amd64 | arm64        | s390x  |
|--------------|-------|--------------|--------|
| Longhorn 1.6 |       | experimental |        |
| Longhorn 1.7 |       | stable       | stable |
tserong commented 10 months ago

I enabled aarch64 builds for our 15.5 dependencies in https://build.opensuse.org/project/show/filesystems:ceph:s3gw. These all built successfully, which is enough to allow `docker build --file Dockerfile --target s3gw --platform linux/aarch64 .` to actually attempt a build. It sets up the buildenv correctly, but when it runs ninja to do the build, it fails with:

```
201.6 CMake Error:
201.6   Running
201.6
201.6    '/usr/bin/ninja' '-C' '/srv/ceph/build' '-t' 'recompact'
201.6
201.6   failed with:
201.6
201.6    ninja: error: build.ninja:17679: unknown pool name 'heavy_compile_job_pool'
```

Not knowing what's up with that, I tried switching to using make instead (dropping the unit tests because I'm not sure about their target names when run with make), in case that would give us a quick win:

```diff
index 4f0f6192a04..062eb4acd0c 100755
--- a/qa/rgw/store/sfs/build-radosgw.sh
+++ b/qa/rgw/store/sfs/build-radosgw.sh
@@ -37,7 +37,6 @@ CC=${CC:-"gcc-12"}
 CXX=${CXX:-"g++-12"}

 CEPH_CMAKE_ARGS=(
-  "-GNinja"
   "-DBOOST_J=${NPROC}"
   "-DCMAKE_C_COMPILER=${CC}"
   "-DCMAKE_CXX_COMPILER=${CXX}"
@@ -111,14 +110,7 @@ _configure() {
 _build() {
   pushd "${SFS_BUILD_DIR}"

-  ninja -j "${NPROC}" bin/radosgw crypto_plugins
-
-  if [ "${WITH_TESTS}" == "ON" ] ; then
-    # discover tests from ctest tags. Selects all tests which have the tag s3gw
-    mapfile -t \
-      UNIT_TESTS <<< "$(ctest -N -L s3gw | grep "Test #" | awk '{print $3}')"
-    ninja -j "${NPROC}" "${UNIT_TESTS[@]}"
-  fi
+  make -j "${NPROC}" radosgw crypto_plugins

   popd
 }
```

*Narrator: It did not give us a quick win.*

This failed with:

```
289.6 [  0%] Generating mon_options.cc, ../../../include/mon_legacy_options.h
291.5 In file included from /srv/ceph/src/common/config_values.h:59,
291.5                  from /srv/ceph/src/common/config.h:27,
291.5                  from /srv/ceph/src/common/config_proxy.h:6,
291.5                  from /srv/ceph/src/common/ceph_context.h:41,
291.5                  from /srv/ceph/src/common/dout.h:29,
291.5                  from /srv/ceph/src/include/Context.h:19,
291.5                  from /srv/ceph/src/msg/Message.h:28,
291.5                  from /srv/ceph/src/msg/DispatchQueue.cc:15:
291.5 /srv/ceph/src/common/options/legacy_config_opts.h:6:10: fatal error: mon_legacy_options.h: No such file or directory
291.5     6 | #include "mon_legacy_options.h"
291.5       |          ^~~~~~~~~~~~~~~~~~~~~~
291.5 compilation terminated.
291.5 make[3]: *** [src/msg/CMakeFiles/common-msg-objs.dir/build.make:76: src/msg/CMakeFiles/common-msg-objs.dir/DispatchQueue.cc.o] Error 1
291.5 make[2]: *** [CMakeFiles/Makefile2:4323: src/msg/CMakeFiles/common-msg-objs.dir/all] Error 2
291.5 make[2]: *** Waiting for unfinished jobs....
291.8 [  0%] Generating osd_options.cc, ../../../include/osd_legacy_options.h
294.2 [  4%] Generating mds_options.cc, ../../../include/mds_legacy_options.h
299.2 [  4%] Generating rbd-mirror_options.cc, ../../../include/rbd-mirror_legacy_options.h
300.9 In file included from /srv/ceph/src/common/config_values.h:59,
300.9                  from /srv/ceph/src/common/config.h:27,
300.9                  from /srv/ceph/src/common/config_proxy.h:6,
300.9                  from /srv/ceph/src/common/ceph_context.h:41,
300.9                  from /srv/ceph/src/osd/osd_types.h:40,
300.9                  from /srv/ceph/src/crush/CrushWrapper.cc:4:
300.9 /srv/ceph/src/common/options/legacy_config_opts.h:8:10: fatal error: rbd_legacy_options.h: No such file or directory
300.9     8 | #include "rbd_legacy_options.h"
300.9       |          ^~~~~~~~~~~~~~~~~~~~~~
300.9 compilation terminated.
```

So we should really try to figure out what's up with that ninja failure.
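
(For anyone who gets to it first, step one is probably just hunting down where that pool name comes from; a sketch:)

```sh
# Find every place the unknown pool is referenced or defined in the tree.
grep -rn heavy_compile_job_pool /srv/ceph
```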

tserong commented 10 months ago

I figured it out. `heavy_compile_job_pool` is referenced in a couple of places in ceph's CMakeLists.txt files; the one that was biting us is in src/tools/ceph-dencoder/CMakeLists.txt. The pool is set up in https://github.com/aquarist-labs/ceph/blob/s3gw/cmake/modules/LimitJobs.cmake, but only if `cmake_host_system_information(RESULT _num_cores QUERY NUMBER_OF_LOGICAL_CORES)` returns a number greater than zero. That will always be the case on a real system, but unfortunately, when `docker build` is run with `--platform linux/aarch64`, qemu-aarch64 presents a /proc/cpuinfo with no processor lines, and counting those lines is exactly how cmake determines the number of logical processors (see https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/kwsys/SystemInformation.cxx#L3438-3467).
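
The symptom is easy to reproduce (a sketch; the image name is just an example, and it assumes qemu user-mode emulation is registered on the host):

```sh
# Count the "processor" lines in /proc/cpuinfo inside an emulated
# container; this is the same thing cmake parses to determine
# NUMBER_OF_LOGICAL_CORES.
docker run --rm --platform linux/aarch64 opensuse/leap:15.5 \
  sh -c 'grep -c ^processor /proc/cpuinfo || true'
# Under qemu-user this prints 0, so LimitJobs.cmake never sets up the
# job pool and ninja later trips over the unknown pool name.
```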

There's a pretty straightforward workaround for that:

```diff
--- a/cmake/modules/LimitJobs.cmake
+++ b/cmake/modules/LimitJobs.cmake
@@ -2,6 +2,11 @@ set(MAX_COMPILE_MEM 3500 CACHE INTERNAL "maximum memory used by each compiling j
 set(MAX_LINK_MEM 4500 CACHE INTERNAL "maximum memory used by each linking job (in MiB)")

 cmake_host_system_information(RESULT _num_cores QUERY NUMBER_OF_LOGICAL_CORES)
+# This will never be zero on a real system, but it can be if you're doing
+# weird things like trying to cross-compile using qemu emulation.
+if(_num_cores EQUAL 0)
+  set(_num_cores 1)
+endif()
 cmake_host_system_information(RESULT _total_mem QUERY TOTAL_PHYSICAL_MEMORY)

 math(EXPR _avg_compile_jobs "${_total_mem} / ${MAX_COMPILE_MEM}")
```

...so now I've got a build running. More to follow if/when it eventually completes (it is not fast)...

tserong commented 9 months ago

With https://github.com/aquarist-labs/ceph/pull/256 applied, I eventually got an aarch64 s3gw container built. It took TWENTY TWO HOURS running on a single core under qemu-aarch64, but it did build successfully, so at least we know it works. To do this for real, we'll want actual aarch64 builders.
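
Once we have those, producing the actual multiarch image should be something like this (untested sketch; the tag is a placeholder):

```sh
# Build both architectures and push them as a single multiarch manifest.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --file Dockerfile --target s3gw \
  --tag registry.example.com/s3gw/s3gw:latest \
  --push .
```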

tserong commented 9 months ago

With https://github.com/aquarist-labs/ceph/pull/259 applied, which falls back to using nproc to get the number of cores, I got a build in only 7.4 hours. On my 8-core desktop it seems to have pretty consistently used 6 cores. On a newer, faster system with more cores, I'd expect an even better build time.
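
(My understanding of why the nproc fallback works: unlike cmake's /proc/cpuinfo parsing, nproc derives its count from the CPU affinity mask, which qemu-user passes through from the host. Easy enough to check:)

```sh
# nproc reports the host's core count even under emulation, while the
# /proc/cpuinfo-based count comes back as 0.
docker run --rm --platform linux/aarch64 opensuse/leap:15.5 nproc
```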

m-ildefons commented 9 months ago

This is good news. I wonder if there's a bottleneck elsewhere since it still takes way longer than I anticipated. Anyways, we can throw up to 32 cores at the problem with our current workers, which should make this workable. Now I'll just need to figure out how to get qemu+buildx working in the pipelines without seeing the same problems as before.
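
For reference, the setup usually boils down to registering qemu's binfmt handlers and creating a buildx builder. A manual sketch (assuming tonistiigi/binfmt, which is what the usual GH actions wrap; our pipeline setup may differ):

```sh
# Register qemu user-mode emulators with the kernel's binfmt_misc.
docker run --privileged --rm tonistiigi/binfmt --install arm64,s390x
# Create a buildx builder that can use them, and start it.
docker buildx create --name multiarch --use
docker buildx inspect --bootstrap
```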

tserong commented 9 months ago

> I wonder if there's a bottleneck elsewhere since it still takes way longer than I anticipated.

It could just be that my desktop system sucks a bit. I can try to force an aarch64 build through CI if you like, by opening a PR with something like this:

```diff
--- a/.github/workflows/test-s3gw.yml
+++ b/.github/workflows/test-s3gw.yml
@@ -56,6 +56,7 @@ jobs:
       - name: Build Unittests
         run: |
           docker build \
+            --platform linux/aarch64 \
             --build-arg CMAKE_BUILD_TYPE=Debug \
             --build-arg NPROC=16 \
             --file s3gw/Dockerfile \
@@ -65,11 +66,12 @@ jobs:

       - name: Run Unittests
         run: |
-          docker run --rm s3gw-unittests:${IMAGE_TAG}
+          docker run --platform linux/aarch64 --rm s3gw-unittests:${IMAGE_TAG}

       - name: Build s3gw Container Image
         run: |
           docker build \
+            --platform linux/aarch64 \
             --build-arg CMAKE_BUILD_TYPE=Debug \
             --build-arg NPROC=16 \
             --build-arg SRC_S3GW_DIR=s3gw \
@@ -85,7 +87,7 @@ jobs:
           source ceph/qa/rgw/store/sfs/tests/helpers.sh

           mkdir -p integration/storage
-          CONTAINER=$(docker run --rm -d \
+          CONTAINER=$(docker run --platform linux/aarch64 --rm -d \
             -p 7480:7480 \
             -v $GITHUB_WORKSPACE/integration/storage:/data \
             s3gw:${IMAGE_TAG} \
@@ -110,7 +112,7 @@ jobs:
           source ceph/qa/rgw/store/sfs/tests/helpers.sh

           mkdir -p smoke/storage
-          CONTAINER=$(docker run --rm -d \
+          CONTAINER=$(docker run --platform linux/aarch64 --rm -d \
             -p 7480:7480 \
             -v $GITHUB_WORKSPACE/smoke/storage:/data \
             s3gw:${IMAGE_TAG} \
@@ -128,7 +130,7 @@ jobs:
         run: |
           set -x

-          docker run --rm \
+          docker run --platform linux/aarch64 --rm \
             -v /run/docker.sock:/run/docker.sock \
             -v ${GITHUB_WORKSPACE}/s3tr-out:/out \
             --pull=always \
@@ -149,7 +151,7 @@ jobs:
         run: |
           set -x

-          docker run --rm \
+          docker run --platform linux/aarch64 --rm \
             -v ${GITHUB_WORKSPACE}/s3tr-out:/out \
             -v ${GITHUB_WORKSPACE}/ceph:/ceph:ro \
             ghcr.io/aquarist-labs/s3tr:latest \
```

(The above is untested, but the docker build lines are essentially what I ran here on my desktop system)

m-ildefons commented 9 months ago

You're using buildx and QEMU, right? This would need to be set up for the workers first. There are GH actions that make this easy. We even had them set up until last week, when they gave us trouble (again :roll_eyes:).

tserong commented 9 months ago

> You're using buildx and QEMU, right?

Yeah, but it Just Worked[TM], i.e. I didn't do anything other than install docker on my desktop, and somehow it knew how to do the right thing when I ran `docker build --platform linux/aarch64`. My thought was to add that argument just as a one-off test, to see how quickly we could theoretically get a build out of CI on arm.
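
(If anyone is curious where the "Just Worked" magic comes from: a host with qemu binfmt handlers registered shows them under binfmt_misc. A quick check, assuming a Linux host:)

```sh
# List registered qemu handlers; a qemu-aarch64 entry explains why
# cross-arch docker builds succeed without any extra setup.
ls /proc/sys/fs/binfmt_misc/ | grep qemu
cat /proc/sys/fs/binfmt_misc/qemu-aarch64
```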