void-linux / void-packages

The Void source packages collection
https://voidlinux.org
Other
2.59k stars 2.16k forks source link

darktable>=3.0.x: build fails for aarch64* with openmp #21105

Closed lemmi closed 1 year ago

lemmi commented 4 years ago

Opening this issue to document the error when cross compiling darktable for aarch64:

/builddir/darktable-3.0.1/src/common/bilateral.c:332:6: warning: GCC does not currently support mixed size types for 'simd' functions
  332 | void dt_bilateral_slice_to_output(const dt_bilateral_t *const b, const float *const in, float *out,
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/builddir/darktable-3.0.1/src/common/bilateral.c:276:6: warning: GCC does not currently support mixed size types for 'simd' functions
  276 | void dt_bilateral_slice(const dt_bilateral_t *const b, const float *const in, float *out, const float detail)
      |      ^~~~~~~~~~~~~~~~~~
/tmp/cc0eANtK.s: Assembler messages:
/tmp/cc0eANtK.s:3724: Error: unknown pseudo-op: `.variant_pcs'
/tmp/cc0eANtK.s:3861: Error: unknown pseudo-op: `.variant_pcs'
/tmp/cc0eANtK.s:3986: Error: unknown pseudo-op: `.variant_pcs'
/tmp/cc0eANtK.s:4164: Error: unknown pseudo-op: `.variant_pcs'
make[2]: *** [src/CMakeFiles/lib_darktable.dir/build.make:136: src/CMakeFiles/lib_darktable.dir/common/bilateral.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:1829: src/CMakeFiles/lib_darktable.dir/all] Error 2
make: *** [Makefile:172: all] Error 2
Sturmflut commented 3 years ago

I just ran into the same problem when building for Fujitsu A64FX with GCC 9.2.1 on CentOS 8.2. This is the full compiler all:

/opt/rh/gcc-toolset-9/root/usr/bin/cc -DDT_HAVE_SIGNAL_TRACE -DGDK_DISABLE_DEPRECATED -DGDK_VERSION_MIN_REQUIRED=GDK_VERSION_3_22 -DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_MIN_REQUIRED -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_40 -DGTK_DISABLE_DEPRECATED -DGTK_DISABLE_SINGLE_INCLUDES -DHAVE_CONFIG_H -DHAVE_GAME -DHAVE_ICU -DHAVE_KWALLET -DHAVE_LENSFUN -DHAVE_OPENCL -DHAVE_SQLITE_324_OR_NEWER -DSQLITE_CORE -DSQLITE_ENABLE_ICU -D_XOPEN_SOURCE=700 -D__GDK_KEYSYMS_COMPAT_H__ -Dlib_darktable_EXPORTS -I/home/scc/mr2515/darktable/build/bin -I/home/scc/mr2515/darktable/src -isystem /home/scc/mr2515/darktable/src/external -isystem /home/scc/mr2515/darktable/src/external/OpenCL -I/home/scc/mr2515/darktable/build/lib64/darktable/rawspeed/src -isystem /home/scc/mr2515/darktable/src/external/rawspeed/src/external -I/home/scc/mr2515/darktable/src/external/rawspeed/src/librawspeed -I/home/scc/mr2515/darktable/src/external/whereami/src -isystem /usr/include/glib-2.0 -isystem /usr/lib64/glib-2.0/include -isystem /usr/include/gtk-3.0 -isystem /usr/include/pango-1.0 -isystem /usr/include/fribidi -isystem /usr/include/cairo -isystem /usr/include/pixman-1 -isystem /usr/include/freetype2 -isystem /usr/include/libpng16 -isystem /usr/include/uuid -isystem /usr/include/harfbuzz -isystem /usr/include/gdk-pixbuf-2.0 -isystem /usr/include/gio-unix-2.0 -isystem /usr/include/atk-1.0 -isystem /usr/include/at-spi2-atk/2.0 -isystem /usr/include/at-spi-2.0 -isystem /usr/include/dbus-1.0 -isystem /usr/lib64/dbus-1.0/include -isystem /usr/include/libxml2 -isystem /home/scc/mr2515/local/include -isystem /usr/include/librsvg-2.0 -isystem /usr/include/json-glib-1.0 -Wall -Wformat -Wformat-security -Wshadow -Wtype-limits -Wvla -Wold-style-declaration -Wno-unknown-pragmas -Wno-error=varargs -Wno-format-truncation -Wno-error=address-of-packed-member -std=c99 -fopenmp -march=native -g -O2 -g -DNDEBUG -O2 -fPIC -Werror -Wfatal-errors -o CMakeFiles/lib_darktable.dir/common/bilateral.c.o -c /home/scc/mr2515/darktable/src/common/bilateral.c

Looks like an actual compiler limitation on ARMv8A. The other compilers on the system are Cray 10 and Clang 10, but both don't pass the "supported platform" test in the CMakeFile.

I'll try to build GCC 10.

lemmi commented 3 years ago

Thanks for reporting. At the moment we turned openmp off, which is kind of a shame, since we now seem to get affordable multicore machines with aarch64. Which darktable version is this?

github-actions[bot] commented 2 years ago

Issues become stale 90 days after last activity and are closed 14 days after that. If this issue is still relevant bump it or assign it.

Sturmflut commented 2 years ago

Funnily I came across my own post in this issue one and a half years later when trying to build darktable on a Raspberry Pi.

It looks like this problem still exists with GCC 10.2.1 and the current darktable master (I just tried commit ab7e374330a9e50abad0f2784bda4b319e770239, a bunch of commits after the 4.0 release).

An example for code that raises this error is:

// Kahan summation algorithm
#ifdef _OPENMP
#pragma omp declare simd aligned(c)
#endif
static inline float Kahan_sum(const float m, float *const __restrict__ c, const float add)
{
   const float t1 = add - (*c);
   const float t2 = m + t1;
   *c = (t2 - m) - t1;
   return t2;
}

The two const float are 32 bit, the float *const should be 64 bit on AArch64. The limitation that all types must be of the same length for OpenMD SIMD only exists for AArch64 in GCC, and this is still the case in the latest GCC 12.2.0. So it's basically possible to use all parts of OpenMP except SIMD.

Theoretically this should be accomplished with CFLAGS="-fopenmp -fno-openmp-simd". But it isn't. There is also no GCC macro to check for OpenMP SIMD support or something like that (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80502), so apart from changing the code I don't see an other option than to completely disable OpenMP when building on AArch64.

lemmi commented 2 years ago

Thanks for the update

lemmi commented 2 years ago

So this made me have another look, whether I couldn't find a solution upstream. Seems like the flag is also necessary on macos, but they also add -DBUILD_SSE2_CODEPATHS=OFF. That made me try to replace -DUSE_OPENMP=OFF with that:

diff --git a/srcpkgs/darktable/template b/srcpkgs/darktable/template
index 9d0ba9ba70..58fd4fd818 100644
--- a/srcpkgs/darktable/template
+++ b/srcpkgs/darktable/template
@@ -40,6 +40,6 @@ fi

 case $XBPS_TARGET_MACHINE in
        aarch64*)
-               configure_args+=" -DUSE_OPENMP=OFF"
+               configure_args+=" -DBUILD_SSE2_CODEPATHS=OFF"
                ;;
 esac

Maybe someone can try whether this produces a working binary?

Regardless, without -DUSE_OPENMP=OFF I still get

/builddir/darktable-4.0.0/src/common/iop_profile.h:286:20: warning: GCC does not currently support mixed size types for 'simd' functions
  286 | static inline void dt_ioppr_rgb_matrix_to_xyz(const dt_aligned_pixel_t rgb, dt_aligned_pixel_t xyz,
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~

But these issues are now warnings (at least with darktable-4.0), not errors anymore and darktable compiles.

Sturmflut commented 2 years ago

BUILD_SSE2_CODEPATHS=OFF should be set automatically by CMake. USE_OPENMP=OFF is undesirable if you have more than one CPU core.

As a Proof-of-Concept, I converted all the #ifdef _OPENMP to #if defined(_OPENMP) && !defined(__aarch64__) if they wrapped around a #pragma omp simd. The attached patch does this for the current darktable master. It builds with OpenMP on a Raspberry Pi 3 and runs in parallel on all four cores.

Obviously this has to be fixed first and foremost in GCC. If the OpenMP standard supports mixed types for OpenMP SIMD and GCC supports mixed types on other platforms, it definitely also should do so on AArch64.

lemmi commented 2 years ago

Yeah, we shouldn't maintain such a patch, agreed. Have you by any chance tried building darktable with llvm?

Sturmflut commented 2 years ago

I just did. You need the following patch to get Clang 11 working:

--- a/src/is_supported_platform.h
+++ b/src/is_supported_platform.h
@@ -30,7 +30,7 @@
 #define DT_SUPPORTED_X86 0
 #endif

-#if defined(__aarch64__) && (defined(__ARM_64BIT_STATE) && defined(__ARM_ARCH) && defined(__ARM_ARCH_8A) || defined(__APPLE__) || defined(__MINGW64__))
+#if defined(__aarch64__) && (defined(__ARM_64BIT_STATE) && defined(__ARM_ARCH) && (defined(__ARM_ARCH_8A) || __ARM_ARCH_PROFILE == 'A') || defined(__APPLE__) || defined(__MINGW64__))
 #define DT_SUPPORTED_ARMv8A 1
 #else
 #define DT_SUPPORTED_ARMv8A 0

It compiles fine without any modifications to the OpenMP SIMD statements and the generated binary seems to work perfectly, utilizing all four cores on the Raspberry Pi 3 Model B.

I've sent a Pull Request with this patch to the darktable devs, see darktable-org/darktable#12388.

lemmi commented 2 years ago

@Sturmflut I see the patch was accepted upstream. This means we need to conditionally build darktable with llvm, correct? I need to wait for 4.0.1 anyway, since 4.0.0 has a couple of issue that need to be resolved. Do you already have something in place I can use for the llvm build, so I can potentially add this to 4.0.1?

github-actions[bot] commented 1 year ago

Issues become stale 90 days after last activity and are closed 14 days after that. If this issue is still relevant bump it or assign it.