google / highway

Performance-portable, length-agnostic SIMD with runtime dispatch
Apache License 2.0
3.96k stars 307 forks source link

i386 attempts to compile AVX512BF16, but it should be disabled #2214

Closed jan-wassenberg closed 1 month ago

jan-wassenberg commented 1 month ago

Debian build log

The compile error is because HWY_AVX3_HAVE_F32_TO_BF16C is set, which is gated by

#if HWY_TARGET <= HWY_AVX3_ZEN4 && !HWY_COMPILER_CLANGCL &&           \
    (HWY_COMPILER_GCC_ACTUAL >= 1000 || HWY_COMPILER_CLANG >= 900) && \
    !defined(HWY_AVX3_DISABLE_AVX512BF16)

HWY_TARGET should not be AVX2 or below because we set

#if HWY_ARCH_X86_32 // i.e. defined(__i386__) || defined(_M_IX86)
#define HWY_BROKEN_32BIT (HWY_AVX2 | (HWY_AVX2 - 1))

Unfortunately godbolt seems to have dropped i386 gcc. Our 32-bit builder works, and indeed sets HWY_BROKEN_TARGETS correctly:

CFLAGS=-m32 CXXFLAGS=-m32 LDFLAGS=-m32 CXX=g++ CC=gcc cmake .. -DHWY_WARNINGS_ARE_ERRORS:BOOL=ON -DHWY_CMAKE_SSE2:BOOL=ON  -DCMAKE_BUILD_TYPE=Release

@malaterre , can you help me understand the difference, or how we are mis-detecting i386?

malaterre commented 1 month ago

I used this local patch:

see also:

should I remove this ?

jan-wassenberg commented 1 month ago

Ah, got it, thanks :) If we want to enable AVX2 for GCC13 that's fine, but AVX3 is definitely going to be a problem, as the build failure here shows. We can change your patch to disable HWY_AVX3 | (HWY_AVX3 - 1) i.e. AVX3 and any better.

Then I think this issue is resolved.