ARM-software / astc-encoder

The Arm ASTC Encoder, a compressor for the Adaptive Scalable Texture Compression data format.
https://developer.arm.com/graphics
Apache License 2.0
1.07k stars 238 forks source link

Build fails on i386 with -march=nehalem or newer #468

Closed VVD closed 5 months ago

VVD commented 5 months ago

I got this build error related to popcnt and i386:

/wrkdirs/usr/ports/graphics/bgfx/work/bgfx.cmake-1.127.8725-465/bimg/3rdparty/astc-encoder/source/astcenc_vecmathlib_sse_4.h:1313:26: error: use of undeclared identifier '_mm_popcnt_u64'; did you mean '_mm_popcnt_u32'?
        return static_cast<int>(_mm_popcnt_u64(v));
                                ^~~~~~~~~~~~~~
                                _mm_popcnt_u32
/usr/lib/clang/14.0.5/include/popcntintrin.h:33:1: note: '_mm_popcnt_u32' declared here
_mm_popcnt_u32(unsigned int __A)
^
1 error generated.

FreeBSD 13.2 i386, llvm 14.0.5, -march=nehalem or newer, as a workaround I can use -march=penryn to prevent usage of popcnt.

This patch work for me:

--- bimg/3rdparty/astc-encoder/source/astcenc_vecmathlib_sse_4.h.orig
+++ bimg/3rdparty/astc-encoder/source/astcenc_vecmathlib_sse_4.h
@@ -1309,5 +1309,7 @@ ASTCENC_SIMD_INLINE int popcount(uint64_t v)
 {
 #if defined(__MINGW32__)
        return static_cast<int>(__builtin_popcountll(v));
+#elif defined(__FreeBSD__) && !defined(__x86_64__)
+       return static_cast<int>(_mm_popcnt_u32(static_cast<uint32_t>(v)));
 #else
        return static_cast<int>(_mm_popcnt_u64(v));
 #endif // defined(__MINGW32__)
solidpixel commented 5 months ago

That fix is broken - it's truncating the value to 32-bits before taking the popcount.

Any reason you don't use the __builtin_popcountll() fallback, the same as __MINGW32__? I suspect that fallback could just be

#if !defined(__x86_64__)
        return static_cast<int>(__builtin_popcountll(v));
#else
VVD commented 5 months ago

I'm new to this, so I did it this way. Your patch fixed the build issue too and look better. Thanks!

solidpixel commented 5 months ago

Thanks for confirming - I'll get that in for 4.9.

VVD commented 5 months ago

If, before writing the patch, I had read what the popcnt instructions do, I would have suggested the following patch:

return static_cast<int>(_mm_popcnt_u32(static_cast<uint32_t>(v)) + _mm_popcnt_u32(static_cast<uint32_t>(v >> 32)));

:-)

VVD commented 5 months ago

Patch against main branch:

--- Source/astcenc_vecmathlib_sse_4.h.orig      2024-05-07 09:59:55 UTC
+++ Source/astcenc_vecmathlib_sse_4.h
@@ -1307,7 +1307,11 @@ ASTCENC_SIMD_INLINE vfloat4 dot3(vfloat4 a, vfloat4 b)
  */
 ASTCENC_SIMD_INLINE int popcount(uint64_t v)
 {
+#if defined(__x86_64__)
        return static_cast<int>(_mm_popcnt_u64(v));
+#else
+       return static_cast<int>(__builtin_popcountll(v));
+#endif
 }

 #endif // ASTCENC_POPCNT >= 1
solidpixel commented 5 months ago

Fixed on main.