openvswitch / ovs-issues

Issue tracker repo for Open vSwitch
10 stars 3 forks source link

OVS 2.14 build fails if DPDK is build with -mno-avx512f #201

Closed fmoessbauer closed 3 years ago

fmoessbauer commented 3 years ago

In OVS 2.14, AVX512 support has been enabled / introduced. Support for that is auto-detected based compiler support.

Also, the flags DPDK has been compiled with are inspected using the package config file. If DPDK was compiled without AVX512 support (e.g. by -march=corei7 or -mno-avx512f), OVS cannot be build due to the following error:

/usr/lib/gcc/x86_64-linux-gnu/8/include/avx512fintrin.h:6216:1: error: inlining failed in call to always_inline ‘_mm512_maskz_loadu_epi64’: target specific option mismatch
 _mm512_maskz_loadu_epi64 (__mmask8 __U, void const *__P)
 ^~~~~~~~~~~~~~~~~~~~~~~~
../lib/dpif-netdev-lookup-avx512-gather.c:79:24: note: called from here
     __m512i v_mask   = _mm512_maskz_loadu_epi64(lane_mask, &maskp[0]);
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-linux-gnu/8/include/immintrin.h:45,
                 from /usr/lib/gcc/x86_64-linux-gnu/8/include/x86intrin.h:48,
                 from /usr/include/dpdk/../x86_64-linux-gnu/dpdk/rte_vect.h:28,
                 from /usr/include/dpdk/../x86_64-linux-gnu/dpdk/rte_memcpy.h:17,
                 from /usr/include/dpdk/rte_mempool.h:51,
                 from /usr/include/dpdk/rte_mbuf.h:38,
                 from ../lib/dp-packet.h:25,
                 from ../lib/dpif.h:380,
                 from ../lib/dpif-netdev.h:23,
                 from ../lib/dpif-netdev-lookup-avx512-gather.c:22:

The OVS build should auto-detect this incompatible configuration and disable the AVX512 optimizations. It would also be good to have a switch to manually do this at configure time.

istokes commented 3 years ago

Thanks for highlighting this.

@Van Haaren, Harrymailto:harry.van.haaren@intel.com, have you seen this issue previously or thoughts on how to address?

Regards Ian

From: Felix Moessbauer notifications@github.com Sent: Thursday, January 7, 2021 4:05 PM To: openvswitch/ovs-issues ovs-issues@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [openvswitch/ovs-issues] OVS 2.14 build fails if DPDK is build with -mno-avx512f (#201)

In OVS 2.14, AVX512 support has been enabled / introduced. Support for that is auto-detected based compiler support.

Also, the flags DPDK has been compiled with are inspected using the package config file. If DPDK was compiled without AVX512 support (e.g. by -march=corei7 or -mno-avx512f), OVS cannot be build due to the following error:

/usr/lib/gcc/x86_64-linux-gnu/8/include/avx512fintrin.h:6216:1: error: inlining failed in call to always_inline ‘_mm512_maskz_loadu_epi64’: target specific option mismatch

_mm512_maskz_loadu_epi64 (mmask8 U, void const *__P)

^~~~~~~~

../lib/dpif-netdev-lookup-avx512-gather.c:79:24: note: called from here

 __m512i v_mask   = _mm512_maskz_loadu_epi64(lane_mask, &maskp[0]);

                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In file included from /usr/lib/gcc/x86_64-linux-gnu/8/include/immintrin.h:45,

             from /usr/lib/gcc/x86_64-linux-gnu/8/include/x86intrin.h:48,

             from /usr/include/dpdk/../x86_64-linux-gnu/dpdk/rte_vect.h:28,

             from /usr/include/dpdk/../x86_64-linux-gnu/dpdk/rte_memcpy.h:17,

             from /usr/include/dpdk/rte_mempool.h:51,

             from /usr/include/dpdk/rte_mbuf.h:38,

             from ../lib/dp-packet.h:25,

             from ../lib/dpif.h:380,

             from ../lib/dpif-netdev.h:23,

             from ../lib/dpif-netdev-lookup-avx512-gather.c:22:

The OVS build should auto-detect this incompatible configuration and disable the AVX512 optimizations. It would also be good to have a switch to manually do this at configure time.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/openvswitch/ovs-issues/issues/201, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACMDYEN3F3CLB3XQZ6GW7LTSYXLRDANCNFSM4VZFVWAA.

istokes commented 3 years ago

Hey Folks,

Seems indeed like an incompatibility between DPDK configuration and OVS configuration. I'm not sure exactly how OVS build-system is picking up the DPDK -march= or -mno-avx512f flags. Felix: would you describe what versions of each project you're using, and your configure/compile commands?

Regarding a switch to do this at configure time, using -march=X or -mno-avx512f for OVS's CFLAGS when running ./configure is expected to work. Does building OVS using the same CFLAGS as DPDK work?

Regards, -Harry

harry-van-haaren commented 3 years ago

Hi Folks,

Somehow the above email showed up as Ian Stokes. I've now created an account on Github, so can reply here directly.

@fmoessbauer; the above questions in https://github.com/openvswitch/ovs-issues/issues/201#issuecomment-756243514 are from me, and would help to identify/root-cause the problem here.

Regards, -Harry

igsilya commented 3 years ago

I guess, it's kind of a same problem that we tried to fix by stripping out -march provided by DPDK. I'm also guessing that DPDK adds -mno-avx512f to cflags provided via pkg-config. This sounds even less right than adding -march there. The problem is that OVS adds flags provided by DPDK to the end of cflags, so they override everything that OVS itself or user of the OVS configured. I'd not expect setting CFLAGS for ./configure to fix the issue. Something like this might fix it, though:

diff --git a/acinclude.m4 b/acinclude.m4
index 60871f67a..84a0a0845 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -445,7 +445,7 @@ AC_DEFUN([OVS_CHECK_DPDK], [
     # forces in pkg-config since this could override user-specified options.
     # It's enough to have -mssse3 to build with DPDK headers.
     DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-march=[[^ ]]*//g')
-    OVS_CFLAGS="$OVS_CFLAGS $DPDK_INCLUDE"
+    OVS_CFLAGS="$DPDK_INCLUDE $OVS_CFLAGS"
     OVS_ENABLE_OPTION([-mssse3])

     # DPDK pmd drivers are not linked unless --whole-archive is used.
fmoessbauer commented 3 years ago

Hi Folks,

I use DPKD 19.11 from Debian buster backports (19.11-4~bpo10+1), OVS 2.14 from this repo. There, a pkg config file is placed in /usr/lib/x86_64-linux-gnu/pkgconfig/libdpdk.pc. This includes the mentioned flags:

prefix=/usr
libdir=${prefix}/lib/x86_64-linux-gnu
includedir=${prefix}/include/dpdk

Name: DPDK
Description: The Data Plane Development Kit (DPDK).
Note that CFLAGS might contain an -march flag higher than typical baseline.
This is required for a number of static inline functions in the public headers.
Version: 19.11.0
Requires: libbsd
Requires.private: zlib, libmlx4, libibverbs, libmlx5, libcrypto, jansson, libelf
Libs: -L${libdir} -lrte_telemetry [... a ton of rte libs]
Cflags: -I${includedir}/../x86_64-linux-gnu/dpdk -I${includedir} -include rte_config.h -march=corei7 -mno-avx512f

And as guessed correctly, there is the -march and -mno-avx512f flags. Not sure if that is intended or an error in DPDK or the debianization.

For building, I just use ./configure. @igsilya your patch compiles (at least in my configuration), but I'm not sure if that is a good solution. IMO we should not depend on the order of the flags.

Anyways, I'll try if that also runs successfully and report back. Thanks for the help!

harry-van-haaren commented 3 years ago

Thanks for reporting back Felix, it seems that this is not something in the OVS code that we can handle better (except for Ilya's patch overriding DPDK things, maybe).

I know a lot has improved in DPDK 20.11 for pkg-config file generation (its now using Meson, and the .pc file is generated from that instead of at the packaging stage). I expect that latest versions of OVS/DPDK shouldn't have this issue anymore due to build-system/integration improvements.

fmoessbauer commented 3 years ago

Unfortunately did respond too fast, as my build was without DPDK... @igsilya When building with DPDK, at least for 2.14 the patch does not solve the issue:

First we get the warnings about the incompatible ABI:

  /usr/bin/python3 ../build-aux/dpdkstrip.py --dpdk | \
  sed \
    -e 's,[@]PKIDIR[@],/var/lib/openvswitch/pki,g' \
    -e 's,[@]LOGDIR[@],/var/log/openvswitch,g' \
    -e 's,[@]DBDIR[@],/etc/openvswitch,g' \
    -e 's,[@]PYTHON3[@],/usr/bin/python3,g' \
    -e 's,[@]RUNDIR[@],/var/run/openvswitch,g' \
    -e 's,[@]VERSION[@],2.14.0,g' \
    -e 's,[@]localstatedir[@],/var,g' \
    -e 's,[@]pkgdatadir[@],/usr/share/openvswitch,g' \
    -e 's,[@]sysconfdir[@],/etc,g' \
    -e 's,[@]bindir[@],/usr/bin,g' \
    -e 's,[@]sbindir[@],/usr/sbin,g' \
    -e 's,[@]abs_builddir[@],/home/z0040swb/source/ovs-salsa/_dpdk,g' \
    -e 's,[@]abs_top_srcdir[@],/home/z0040swb/source/ovs-salsa/_dpdk/..,g' \
  > utilities/ovs-l3ping.tmp
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -I ../include -I ./include -I ../lib -I ./lib -Wdate-time -D_FORTIFY_SOURCE=2 -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare -Wshift-negative-value -Wduplicated-cond -Wshadow -Wmultistatement-macros -Wcast-align=strict -mssse3 -include rte_config.h -mno-avx512f -I/usr/include/dpdk/../x86_64-linux-gnu/dpdk -I/usr/include/dpdk -I/usr/include/libnl3 -D_FILE_OFFSET_BITS=64 -g -O2 -fdebug-prefix-map=/home/z0040swb/source/ovs-salsa=. -fstack-protector-strong -Wformat -Werror=format-security -DHAVE_AVX512F -DHAVE_LD_AVX512_GOOD -MT lib/tc.lo -MD -MP -MF lib/.deps/tc.Tpo -c ../lib/tc.c -o lib/tc.o >/dev/null 2>&1
PYTHONPATH=$PYTHONPATH":"../python /usr/bin/python3 ../build-aux/soexpand.py -I.. < ../utilities/ovs-parse-backtrace.in | \
  /usr/bin/python3 ../build-aux/dpdkstrip.py --dpdk | \
  sed \
    -e 's,[@]PKIDIR[@],/var/lib/openvswitch/pki,g' \
    -e 's,[@]LOGDIR[@],/var/log/openvswitch,g' \
    -e 's,[@]DBDIR[@],/etc/openvswitch,g' \
    -e 's,[@]PYTHON3[@],/usr/bin/python3,g' \
    -e 's,[@]RUNDIR[@],/var/run/openvswitch,g' \
    -e 's,[@]VERSION[@],2.14.0,g' \
    -e 's,[@]localstatedir[@],/var,g' \
    -e 's,[@]pkgdatadir[@],/usr/share/openvswitch,g' \
    -e 's,[@]sysconfdir[@],/etc,g' \
    -e 's,[@]bindir[@],/usr/bin,g' \
    -e 's,[@]sbindir[@],/usr/sbin,g' \
    -e 's,[@]abs_builddir[@],/home/z0040swb/source/ovs-salsa/_dpdk,g' \
    -e 's,[@]abs_top_srcdir[@],/home/z0040swb/source/ovs-salsa/_dpdk/..,g' \
  > utilities/ovs-parse-backtrace.tmp
../lib/dpif-netdev-lookup-avx512-gather.c: In function ‘_mm512_popcnt_epi64_manual’:
../lib/dpif-netdev-lookup-avx512-gather.c:45:1: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]

Then, the same errors as previously:

In file included from /usr/lib/gcc/x86_64-linux-gnu/8/include/immintrin.h:55,
                 from /usr/lib/gcc/x86_64-linux-gnu/8/include/x86intrin.h:48,
                 from /usr/include/dpdk/../x86_64-linux-gnu/dpdk/rte_vect.h:28,
                 from /usr/include/dpdk/../x86_64-linux-gnu/dpdk/rte_memcpy.h:17,
                 from /usr/include/dpdk/rte_mempool.h:51,
                 from /usr/include/dpdk/rte_mbuf.h:38,
                 from ../lib/dp-packet.h:25,
                 from ../lib/dpif.h:380,
                 from ../lib/dpif-netdev.h:23,
                 from ../lib/dpif-netdev-lookup-avx512-gather.c:22:
/usr/lib/gcc/x86_64-linux-gnu/8/include/avx512bwintrin.h:413:1: error: inlining failed in call to always_inline ‘_mm512_sad_epu8’: target specific option mismatch
 _mm512_sad_epu8 (__m512i __A, __m512i __B)

Would it be possible to fully disable the AVX512 support, instead of playing around with cpu features?

This issue has more aspects to consider in the larger picture:

igsilya commented 3 years ago

@fmoessbauer , I see. Reordering of flags is really unpredictable here. That all is really sketchy. To be honest, I'd not recommend to anyone to use pkg-config way of building with DPDK for any already released version of OVS (i.e. below 2.15). With 2.15 and DPDK 20.11 it should be somewhat usable, but I guess, there will be the same avx512 related issue that we should fix even there. DPDK forces too many flags in pkg-config that it shouldn't. Here is the long recent discussion about -march flag: https://mail.openvswitch.org/pipermail/ovs-dev/2020-December/378479.html . It ended with us stripping out -march from cflags provided by DPDK and I think we should do the same for avx512 flags:

diff --git a/acinclude.m4 b/acinclude.m4
index 60871f67a..b73526b60 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -445,6 +445,12 @@ AC_DEFUN([OVS_CHECK_DPDK], [
     # forces in pkg-config since this could override user-specified options.
     # It's enough to have -mssse3 to build with DPDK headers.
     DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-march=[[^ ]]*//g')
+    # Also stripping out '-mno-avx512f'.  Support for AVX512 will be disabled
+    # if OVS will detect that it's broken.  OVS could be built with a
+    # completely different toolchain that correctly supports AVX512, flags
+    # forced by DPDK only breaks our feature detection mechanism and leads to
+    # build failures: https://github.com/openvswitch/ovs-issues/issues/201
+    DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-mno-avx512f//g')
     OVS_CFLAGS="$OVS_CFLAGS $DPDK_INCLUDE"
     OVS_ENABLE_OPTION([-mssse3])
igsilya commented 3 years ago

Just tested with DPDK latest main branch and I see that -mno-avx512f ends up in libdpdk-libs.pc. I'll send above stripping patch formally as it's required even for master branch. I'll also add DPDK and Debian maintainers in copy.

istokes commented 3 years ago

@fmoessbauer , I see. Reordering of flags is really unpredictable here. That all is really sketchy. To be honest, I'd not recommend to anyone to use pkg-config way of building with DPDK for any already released version of OVS (i.e. below 2.15). With 2.15 and DPDK 20.11 it should be somewhat usable, but I guess, there will be the same avx512 related issue that we should fix even there. DPDK forces too many flags in pkg-config that it shouldn't. Here is the long recent discussion about -march flag: https://mail.openvswitch.org/pipermail/ovs-dev/2020-December/378479.html . It ended with us stripping out -march from cflags provided by DPDK and I think we should do the same for avx512 flags:

diff --git a/acinclude.m4 b/acinclude.m4
index 60871f67a..b73526b60 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -445,6 +445,12 @@ AC_DEFUN([OVS_CHECK_DPDK], [
     # forces in pkg-config since this could override user-specified options.
     # It's enough to have -mssse3 to build with DPDK headers.
     DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-march=[[^ ]]*//g')
+    # Also stripping out '-mno-avx512f'.  Support for AVX512 will be disabled
+    # if OVS will detect that it's broken.  OVS could be built with a
+    # completely different toolchain that correctly supports AVX512, flags
+    # forced by DPDK only breaks our feature detection mechanism and leads to
+    # build failures: https://github.com/openvswitch/ovs-issues/issues/201
+    DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-mno-avx512f//g')
     OVS_CFLAGS="$OVS_CFLAGS $DPDK_INCLUDE"
     OVS_ENABLE_OPTION([-mssse3])

I think that's fair. I think this is an issue with switching build systems between DPDK releases in particular issues around the pkg-config file had to be generated between the releases. It's something that should be fixed for sure in this case.

igsilya commented 3 years ago

Patch is available here: https://patchwork.ozlabs.org/project/openvswitch/patch/20210108114656.3087152-1-i.maximets@ovn.org/ It's for a master branch, but it could be applied to 2.14 with minor conflict resolution. @fmoessbauer , could you, please, test it on your setup?

fmoessbauer commented 3 years ago

@igsilya Thanks for providing this patch. I ported it back to 2.14 and successfully tested it in the following scenario:

The OVS machine CPU has AVX512 support:

arch_capabilities
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves flush_l1d

When I'm back in the office I'll test this patch on another machine without AVX512 CPUs. So at least in this rather simple scenario it works properly. Anyways, the comments on the patch on the mailing list regarding ABI are also worth considering.

Here's the patch for OVS 2.14

diff --git a/acinclude.m4 b/acinclude.m4
index 84f344da0..d87875150 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -436,6 +436,15 @@ AC_DEFUN([OVS_CHECK_DPDK], [
     if test "$DPDK_AUTO_DISCOVER" = "false"; then
       OVS_LDFLAGS="$OVS_LDFLAGS -L$DPDK_LIB_DIR"
     fi
+    # forces in pkg-config since this could override user-specified options.
+    # It's enough to have -mssse3 to build with DPDK headers.
+    DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-march=[[^ ]]*//g')
+    # Also stripping out '-mno-avx512f'.  Support for AVX512 will be disabled
+    # if OVS will detect that it's broken.  OVS could be built with a
+    # completely different toolchain that correctly supports AVX512, flags
+    # forced by DPDK only breaks our feature detection mechanism and leads to
+    # build failures: https://github.com/openvswitch/ovs-issues/issues/201
+    DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-mno-avx512f//g')
     OVS_CFLAGS="$OVS_CFLAGS $DPDK_INCLUDE"
     OVS_ENABLE_OPTION([-mssse3])