Closed tranzystorekk closed 5 months ago
A similar error occurs on Snow Leopard i386 buildbot in MacPorts:
/bin/sh ../libtool --tag=CC --mode=link /opt/local/bin/clang-mp-11 -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_26 -Wall -Wextra -Wmissing-prototypes -Wwrite-strings -Wunused-macros -Wundef -Wpointer-arith -Werror=format-security -Wfor-loop-analysis -Wlogical-op-parentheses -ffast-math -fvisibility=hidden -I/opt/local/include/glib-2.0 -I/opt/local/lib/glib-2.0/include -DCHAFA_COMPILATION -pipe -Os -arch i386 -no-undefined -version-info 9:1:9 -L/opt/local/lib -Wl,-headerpad_max_install_names -arch i386 -o libchafa.la -rpath /opt/local/lib libchafa_la-chafa-canvas.lo libchafa_la-chafa-canvas-config.lo libchafa_la-chafa-features.lo libchafa_la-chafa-frame.lo libchafa_la-chafa-image.lo libchafa_la-chafa-placement.lo libchafa_la-chafa-symbol-map.lo libchafa_la-chafa-term-db.lo libchafa_la-chafa-term-info.lo libchafa_la-chafa-util.lo -L/opt/local/lib -lglib-2.0 -lintl internal/libchafa-internal.la -lm
libtool: link: /opt/local/bin/clang-mp-11 -dynamiclib -o .libs/libchafa.0.dylib .libs/libchafa_la-chafa-canvas.o .libs/libchafa_la-chafa-canvas-config.o .libs/libchafa_la-chafa-features.o .libs/libchafa_la-chafa-frame.o .libs/libchafa_la-chafa-image.o .libs/libchafa_la-chafa-placement.o .libs/libchafa_la-chafa-symbol-map.o .libs/libchafa_la-chafa-term-db.o .libs/libchafa_la-chafa-term-info.o .libs/libchafa_la-chafa-util.o -Wl,-force_load,internal/.libs/libchafa-internal.a -L/opt/local/lib -lglib-2.0 -lintl -lm -Os -arch i386 -Wl,-headerpad_max_install_names -arch i386 -install_name /opt/local/lib/libchafa.0.dylib -compatibility_version 10 -current_version 10.1 -Wl,-single_module
Undefined symbols for architecture i386:
"__mm_extract_epi64", referenced from:
_calc_colors_avx2 in libchafa-internal.a(libchafa_avx2_la-chafa-avx2.o)
_chafa_color_accum_div_scalar_avx2 in libchafa-internal.a(libchafa_avx2_la-chafa-avx2.o)
ld: symbol(s) not found for architecture i386
clang: error: linker command failed with exit code 1 (use -v to see invocation)
This is most likely due to this code fragment in configure.ac. For optimization purposes it should be called strictly for x86-64 compatible platforms, but for some reason this check is not done.
Thanks for reporting this. I think the issue in both cases is that the _mm_extract_epi64()
macro is undefined in the build environment. Most of the other intrinsics are defined, so the configure test passes. Should be simple to fix.
The AVX code is guarded by a run-time check, so it won't be used on CPUs that don't support AVX.
Would you be able to quickly test a fix? I don't have a way to reproduce the build error at the moment, but I have this in mind:
diff --git a/chafa/internal/chafa-avx2.c b/chafa/internal/chafa-avx2.c
index ae20830..8efbc9d 100644
--- a/chafa/internal/chafa-avx2.c
+++ b/chafa/internal/chafa-avx2.c
@@ -21,6 +21,7 @@
#include <emmintrin.h>
#include <immintrin.h>
+#include <smmintrin.h> /* For _mm_extract_epi64 */
#include "chafa.h"
#include "internal/chafa-private.h"
With that patch nothing changes: chafa__do_build.log
Here's the configure log for completeness: chafa__do_configure.log
Thanks. The logs indicate that _mm_extract_epi32()
is available, but not _mm_extract_epi64()
. The underlying issue seems to be that the pextrq
instruction is not available in 32-bit mode. I'll have to work around that.
This built with -m32
here:
diff --git a/chafa/internal/chafa-avx2.c b/chafa/internal/chafa-avx2.c
index ae20830..7f11687 100644
--- a/chafa/internal/chafa-avx2.c
+++ b/chafa/internal/chafa-avx2.c
@@ -21,9 +21,21 @@
#include <emmintrin.h>
#include <immintrin.h>
+#include <smmintrin.h>
#include "chafa.h"
#include "internal/chafa-private.h"
+/* _mm_extract_epi64() (pextrq) is not available in 32-bit mode. Work around
+ * it. This needs to be a macro, as the compiler expects an integer constant
+ * for n. */
+#if defined __x86_64__ && !defined __ILP32__
+# define extract_128_epi64(i, n) _mm_extract_epi64 ((i), (n))
+#else
+# define extract_128_epi64(i, n) \
+ ((((guint64) _mm_extract_epi32 ((i), (n) * 2 + 1)) << 32) \
+ | _mm_extract_epi32 ((i), (n) * 2))
+#endif
+
gint
calc_error_avx2 (const ChafaPixel *pixels, const ChafaColorPair *color_pair,
const guint32 *sym_mask_u32)
@@ -96,14 +108,14 @@ calc_colors_avx2 (const ChafaPixel *pixels, ChafaColorAccum *accums_out,
accum_bg_128 = _mm_add_epi16 (_mm256_extracti128_si256 (accum_bg, 0),
_mm256_extracti128_si256 (accum_bg, 1));
((guint64 *) accums_out) [0] =
- (guint64) _mm_extract_epi64 (accum_bg_128, 0)
- + (guint64) _mm_extract_epi64 (accum_bg_128, 1);
+ extract_128_epi64 (accum_bg_128, 0)
+ + extract_128_epi64 (accum_bg_128, 1);
accum_fg_128 = _mm_add_epi16 (_mm256_extracti128_si256 (accum_fg, 0),
_mm256_extracti128_si256 (accum_fg, 1));
((guint64 *) accums_out) [1] =
- (guint64) _mm_extract_epi64 (accum_fg_128, 0)
- + (guint64) _mm_extract_epi64 (accum_fg_128, 1);
+ extract_128_epi64 (accum_fg_128, 0)
+ + extract_128_epi64 (accum_fg_128, 1);
}
/* 32768 divided by index. Divide by zero is defined as zero. */
@@ -143,5 +155,5 @@ chafa_color_accum_div_scalar_avx2 (ChafaColorAccum *accum, guint16 divisor)
accum_128 = _mm_loadl_epi64 ((const __m128i *) accum);
divisor_128 = _mm_set1_epi16 (invdiv16 [divisor]);
accum_128 = _mm_mulhrs_epi16 (accum_128, divisor_128);
- *((guint64 *) accum) = _mm_extract_epi64 (accum_128, 0);
+ *((guint64 *) accum) = extract_128_epi64 (accum_128, 0);
Yep, Void CI passes with this patch: https://github.com/void-linux/void-packages/actions/runs/9587902854/job/26438847164?pr=50873
Thank you for your time!
Great! Thanks for distributing it. Are you ok carrying the patch for a couple of weeks, until I can make a new release?
Are you ok carrying the patch for a couple of weeks, until I can make a new release?
Sure, no problem :)
Build log on Void Linux i686 CI: https://github.com/void-linux/void-packages/actions/runs/9554806547/job/26336662670?pr=50873