NLnetLabs / nsd

The NLnet Labs Name Server Daemon (NSD) is an authoritative, RFC compliant DNS nameserver.
https://nlnetlabs.nl/nsd
BSD 3-Clause "New" or "Revised" License

nsd 4.10.0 x86 builds failed #342

Closed chenrui333 closed 2 months ago

chenrui333 commented 3 months ago

👋 I'm trying to build the latest release, but I ran into a build issue. The error log is below:

error build log

```
In file included from ./src/haswell/parser.c:12:
./src/haswell/simd.h:88:21: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'simd_loadu_8x64' that is compiled without support for 'avx'
  simd->chunks[0] = _mm256_loadu_si256((const __m256i *)(address));
                    ^
./src/haswell/simd.h:88:21: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:89:21: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'simd_loadu_8x64' that is compiled without support for 'avx'
  simd->chunks[1] = _mm256_loadu_si256((const __m256i *)(address+32));
                    ^
./src/haswell/simd.h:89:21: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:95:21: error: always_inline function '_mm256_set1_epi8' requires target feature 'avx', but would be inlined into function 'simd_find_8x64' that is compiled without support for 'avx'
  const __m256i k = _mm256_set1_epi8(key);
                    ^
./src/haswell/simd.h:95:21: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:97:22: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'simd_find_8x64' that is compiled without support for 'avx2'
  const __m256i r0 = _mm256_cmpeq_epi8(simd->chunks[0], k);
                     ^
./src/haswell/simd.h:97:22: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:98:22: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'simd_find_8x64' that is compiled without support for 'avx2'
  const __m256i r1 = _mm256_cmpeq_epi8(simd->chunks[1], k);
                     ^
./src/haswell/simd.h:98:22: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:100:33: error: always_inline function '_mm256_movemask_epi8' requires target feature 'avx2', but would be inlined into function 'simd_find_8x64' that is compiled without support for 'avx2'
  const uint64_t m0 = (uint32_t)_mm256_movemask_epi8(r0);
                                ^
./src/haswell/simd.h:100:33: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:101:33: error: always_inline function '_mm256_movemask_epi8' requires target feature 'avx2', but would be inlined into function 'simd_find_8x64' that is compiled without support for 'avx2'
  const uint64_t m1 = (uint32_t)_mm256_movemask_epi8(r1);
                                ^
./src/haswell/simd.h:101:33: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:110:21: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'simd_find_any_8x64' that is compiled without support for 'avx'
  const __m256i t = _mm256_loadu_si256((const __m256i *)table);
                    ^
./src/haswell/simd.h:110:21: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:113:5: error: always_inline function '_mm256_shuffle_epi8' requires target feature 'avx2', but would be inlined into function 'simd_find_any_8x64' that is compiled without support for 'avx2'
    _mm256_shuffle_epi8(t, simd->chunks[0]), simd->chunks[0]);
    ^
./src/haswell/simd.h:113:5: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
./src/haswell/simd.h:112:22: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'simd_find_any_8x64' that is compiled without support for 'avx2'
  const __m256i r0 = _mm256_cmpeq_epi8(
                     ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
clang -I/usr/local/opt/libevent/include -I. -I/usr/local/opt/openssl@3/include -I./simdzone/include -g -O2 -flto -c namedb.c
20 errors generated.
```

Full build log: https://github.com/Homebrew/homebrew-core/actions/runs/9499842527/job/26183926937

Relates to Homebrew/homebrew-core#174464.

k0ekk0ek commented 3 months ago

Hi @chenrui333!

I've posted the following comment on Homebrew/homebrew-core#175796, opened by @anandb-ripencc:

Hi all! 👋 I'm looking at this issue to find out why this is not compiling properly for Homebrew. It seems like the compiler(s) aren't handling target dependencies correctly. E.g. code in westmere gets compiled with -march=westmere, but on Linux the compiler complains about missing POPCNT (which is part of SSE4.2 and thus Westmere), and on Apple x86_64 it complains about needing PCLMUL (also part of Westmere). Of course, there are also the errors related to Haswell. I tried to reproduce this, but couldn't on my local machines.

The quickest fix is to pass --disable-westmere and --disable-haswell to configure. However, I would like to know why this issue exists. Perhaps it's as simple as passing e.g. -mpopcnt and -mpclmul to get westmere working? Is there any way I can easily debug this in a Homebrew build environment?
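For readers unfamiliar with the diagnostics in the log: GCC and Clang only allow an `always_inline` intrinsic to be inlined into a function that is itself compiled with the matching target feature. That feature can be enabled for a whole file with flags such as -march=haswell or -mavx2, or for a single function with a target attribute. The following is a minimal sketch of the per-function variant with a runtime check, assuming GCC or Clang. The `find_byte*` names are hypothetical and only loosely modelled on `simd_find_8x64` from the log. This is not how simdzone is built, since it compiles the westmere and haswell code with per-target -march flags, as described above.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: the target attribute enables AVX2 for this one
 * function, so the always_inline intrinsics can be inlined here even if
 * the file is compiled without -mavx2 or -march=haswell. Removing the
 * attribute (and the flag) produces errors like the ones in the log. */
__attribute__((target("avx2")))
static uint32_t find_byte_avx2(const uint8_t block[32], uint8_t key)
{
  const __m256i input = _mm256_loadu_si256((const __m256i *)block);
  const __m256i k = _mm256_set1_epi8((char)key);
  /* One bit per byte: bit i is set when block[i] == key. */
  return (uint32_t)_mm256_movemask_epi8(_mm256_cmpeq_epi8(input, k));
}
#endif

/* Portable fallback producing the same bitmask without SIMD. */
static uint32_t find_byte_fallback(const uint8_t block[32], uint8_t key)
{
  uint32_t mask = 0;
  for (int i = 0; i < 32; i++) {
    if (block[i] == key)
      mask |= (uint32_t)1 << i;
  }
  return mask;
}

/* Pick the AVX2 kernel only when the running CPU supports it. */
static uint32_t find_byte(const uint8_t block[32], uint8_t key)
{
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("avx2")) /* GCC/Clang builtin */
    return find_byte_avx2(block, key);
#endif
  return find_byte_fallback(block, key);
}
```

Calling `find_byte(block, ';')` on a 32-byte block returns a mask with one bit set per matching byte, whichever kernel is selected at runtime.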

@wcawijngaards made a PR that checks whether the Westmere- and Haswell-specific code actually compiles (see #211), but it needs a little bit of work. It might also not be the right way to go about this: right now we at least see that something is off, whereas if we check at configure time we simply disable the code even though it should work.

I'd like to find out why this fails in the Homebrew environment, but I need a way to reproduce.

k0ekk0ek commented 2 months ago

@chenrui333, the fixes have been integrated in NLnetLabs/simdzone#211. Detection of the intrinsics should now fail at configure time, so you won't get the best-performing kernels, but building should work just fine with the upcoming 4.10.1 release. Closing this. Be sure to let us know if things are still failing with the new release.