hpjansson / chafa

📺🗿 Terminal graphics for the 21st century.
https://hpjansson.org/chafa/
GNU Lesser General Public License v3.0
2.97k stars 64 forks source link

chafa 1.14.1 fails to build on i686 #203

Closed tranzystorekk closed 5 months ago

tranzystorekk commented 5 months ago

Build log on Void Linux i686 CI: https://github.com/void-linux/void-packages/actions/runs/9554806547/job/26336662670?pr=50873

aeiouaeiouaeiouaeiouaeiouaeiou commented 5 months ago

A similar error occurs on Snow Leopard i386 buildbot in MacPorts:

/bin/sh ../libtool  --tag=CC   --mode=link /opt/local/bin/clang-mp-11 -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_26 -Wall -Wextra -Wmissing-prototypes -Wwrite-strings -Wunused-macros -Wundef -Wpointer-arith -Werror=format-security -Wfor-loop-analysis -Wlogical-op-parentheses -ffast-math -fvisibility=hidden -I/opt/local/include/glib-2.0 -I/opt/local/lib/glib-2.0/include -DCHAFA_COMPILATION -pipe -Os -arch i386  -no-undefined -version-info 9:1:9 -L/opt/local/lib -Wl,-headerpad_max_install_names -arch i386 -o libchafa.la -rpath /opt/local/lib libchafa_la-chafa-canvas.lo libchafa_la-chafa-canvas-config.lo libchafa_la-chafa-features.lo libchafa_la-chafa-frame.lo libchafa_la-chafa-image.lo libchafa_la-chafa-placement.lo libchafa_la-chafa-symbol-map.lo libchafa_la-chafa-term-db.lo libchafa_la-chafa-term-info.lo libchafa_la-chafa-util.lo -L/opt/local/lib -lglib-2.0 -lintl internal/libchafa-internal.la -lm 
libtool: link: /opt/local/bin/clang-mp-11 -dynamiclib  -o .libs/libchafa.0.dylib  .libs/libchafa_la-chafa-canvas.o .libs/libchafa_la-chafa-canvas-config.o .libs/libchafa_la-chafa-features.o .libs/libchafa_la-chafa-frame.o .libs/libchafa_la-chafa-image.o .libs/libchafa_la-chafa-placement.o .libs/libchafa_la-chafa-symbol-map.o .libs/libchafa_la-chafa-term-db.o .libs/libchafa_la-chafa-term-info.o .libs/libchafa_la-chafa-util.o   -Wl,-force_load,internal/.libs/libchafa-internal.a  -L/opt/local/lib -lglib-2.0 -lintl -lm  -Os -arch i386 -Wl,-headerpad_max_install_names -arch i386   -install_name  /opt/local/lib/libchafa.0.dylib -compatibility_version 10 -current_version 10.1 -Wl,-single_module
Undefined symbols for architecture i386:
  "__mm_extract_epi64", referenced from:
      _calc_colors_avx2 in libchafa-internal.a(libchafa_avx2_la-chafa-avx2.o)
      _chafa_color_accum_div_scalar_avx2 in libchafa-internal.a(libchafa_avx2_la-chafa-avx2.o)
ld: symbol(s) not found for architecture i386
clang: error: linker command failed with exit code 1 (use -v to see invocation)

This is most likely due to this code fragment in configure.ac. For optimization purposes it should be called strictly for x86-64 compatible platforms, but for some reason this check is not done.

hpjansson commented 5 months ago

Thanks for reporting this. I think the issue in both cases is that the _mm_extract_epi64() macro is undefined in the build environment. Most of the other intrinsics are defined, so the configure test passes. Should be simple to fix.

The AVX code is guarded by a run-time check, so it won't be used on CPUs that don't support AVX.

hpjansson commented 5 months ago

Would you be able to quickly test a fix? I don't have a way to reproduce the build error at the moment, but I have this in mind:

diff --git a/chafa/internal/chafa-avx2.c b/chafa/internal/chafa-avx2.c
index ae20830..8efbc9d 100644
--- a/chafa/internal/chafa-avx2.c
+++ b/chafa/internal/chafa-avx2.c
@@ -21,6 +21,7 @@

 #include <emmintrin.h>
 #include <immintrin.h>
+#include <smmintrin.h>  /* For _mm_extract_epi64 */
 #include "chafa.h"
 #include "internal/chafa-private.h"
tranzystorekk commented 5 months ago

With that patch nothing changes: chafa__do_build.log

Here's the configure log for completeness: chafa__do_configure.log

hpjansson commented 5 months ago

Thanks. The logs indicate that _mm_extract_epi32() is available, but not _mm_extract_epi64(). The underlying issue seems to be that the pextrq instruction is not available in 32-bit mode. I'll have to work around that.

hpjansson commented 5 months ago

This built with -m32 here:

diff --git a/chafa/internal/chafa-avx2.c b/chafa/internal/chafa-avx2.c
index ae20830..7f11687 100644
--- a/chafa/internal/chafa-avx2.c
+++ b/chafa/internal/chafa-avx2.c
@@ -21,9 +21,21 @@

 #include <emmintrin.h>
 #include <immintrin.h>
+#include <smmintrin.h>
 #include "chafa.h"
 #include "internal/chafa-private.h"

+/* _mm_extract_epi64() (pextrq) is not available in 32-bit mode. Work around
+ * it. This needs to be a macro, as the compiler expects an integer constant
+ * for n. */
+#if defined __x86_64__ && !defined __ILP32__
+# define extract_128_epi64(i, n) _mm_extract_epi64 ((i), (n))
+#else
+# define extract_128_epi64(i, n) \
+    ((((guint64) _mm_extract_epi32 ((i), (n) * 2 + 1)) << 32) \
+     | _mm_extract_epi32 ((i), (n) * 2))
+#endif
+
 gint
 calc_error_avx2 (const ChafaPixel *pixels, const ChafaColorPair *color_pair,
                  const guint32 *sym_mask_u32)
@@ -96,14 +108,14 @@ calc_colors_avx2 (const ChafaPixel *pixels, ChafaColorAccum *accums_out,
     accum_bg_128 = _mm_add_epi16 (_mm256_extracti128_si256 (accum_bg, 0),
                                   _mm256_extracti128_si256 (accum_bg, 1));
     ((guint64 *) accums_out) [0] =
-        (guint64) _mm_extract_epi64 (accum_bg_128, 0)
-        + (guint64) _mm_extract_epi64 (accum_bg_128, 1);
+        extract_128_epi64 (accum_bg_128, 0)
+        + extract_128_epi64 (accum_bg_128, 1);

     accum_fg_128 = _mm_add_epi16 (_mm256_extracti128_si256 (accum_fg, 0),
                                   _mm256_extracti128_si256 (accum_fg, 1));
     ((guint64 *) accums_out) [1] =
-        (guint64) _mm_extract_epi64 (accum_fg_128, 0)
-        + (guint64) _mm_extract_epi64 (accum_fg_128, 1);
+        extract_128_epi64 (accum_fg_128, 0)
+        + extract_128_epi64 (accum_fg_128, 1);
 }

 /* 32768 divided by index. Divide by zero is defined as zero. */
@@ -143,5 +155,5 @@ chafa_color_accum_div_scalar_avx2 (ChafaColorAccum *accum, guint16 divisor)
     accum_128 = _mm_loadl_epi64 ((const __m128i *) accum);
     divisor_128 = _mm_set1_epi16 (invdiv16 [divisor]);
     accum_128 = _mm_mulhrs_epi16 (accum_128, divisor_128);
-    *((guint64 *) accum) = _mm_extract_epi64 (accum_128, 0);
+    *((guint64 *) accum) = extract_128_epi64 (accum_128, 0);
tranzystorekk commented 5 months ago

Yep, Void CI passes with this patch: https://github.com/void-linux/void-packages/actions/runs/9587902854/job/26438847164?pr=50873

Thank you for your time!

hpjansson commented 5 months ago

Great! Thanks for distributing it. Are you ok carrying the patch for a couple of weeks, until I can make a new release?

tranzystorekk commented 5 months ago

Are you ok carrying the patch for a couple of weeks, until I can make a new release?

Sure, no problem :)