ipdk-io / ipdk

Infrastructure Programmer Development Kit (IPDK) is an open source, vendor agnostic framework of drivers and APIs for infrastructure offload and management that runs on a CPU, IPU, DPU or switch.
Apache License 2.0
185 stars 68 forks

ovs-p4ctl built with GitHub Actions crashes with a segfault due to AVX512 support being enabled when run on VirtualBox or other systems lacking AVX512 support #118

Open mestery opened 2 years ago

mestery commented 2 years ago

I just spent a few hours looking into an issue that turns out to come down to this: the images we are building and pushing to GHCR do not work.

Pull the GHCR Ubuntu 20.04 image like this:

docker pull ghcr.io/ipdk-io/ipdk-ubuntu2004-x86_64:main

Next, set up your ipdk.env file as shown:

vagrant@ubuntu-focal:~/ipdk/build$ cat ~/.ipdk/ipdk.env
#
# IPDK CLI configuration file
# See for more information https://github.com/ipdk-io/ipdk/blob/main/build/IPDK_Container/README_DOCKER.md#CLI-configuration-settings
#
# /home/vagrant/ipdk/build/scripts contains the location of the 'sourcing' ipdk script when read.
#
BASE_IMG=ubuntu:20.04
IMAGE_NAME=ghcr.io/ipdk-io/ipdk-ubuntu2004-x86_64
DOCKERFILE=/home/vagrant/ipdk/build/scripts/../Dockerfile.ubuntu
TAG=main
vagrant@ubuntu-focal:~/ipdk/build$

Start the container:

$ ipdk start -d

Log in and note that ovs-vswitchd is not running. When you start it manually, it crashes with an illegal instruction and dumps core:

vagrant@ubuntu-focal:~/ipdk/build$ docker exec -it ipdk bash
root@f111597b0a04:~# ps axuw
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0   3976  3244 pts/0    Ss+  20:47   0:00 /bin/bash /root/scripts/start.sh rundaemon
root          45  0.0  0.0   2520   584 pts/0    S+   20:47   0:00 sleep infinity
root          95  0.0  0.0   5300  2708 ?        Ss   20:47   0:00 ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,ma
root         100  0.5  0.0   4240  3668 pts/1    Ss   20:50   0:00 bash
root         114  0.0  0.0   5896  2836 pts/1    R+   20:50   0:00 ps axuw
root@f111597b0a04:~# /root/scripts/start.sh rundaemon
Start as long running process.
1024
1024
Killing OVS Processes If Already Running....
Killing ovsdb-server process....
Creating /var/run/openvswitch
Starting OvS DB server....
Staring OvS VSWITCHD Process....
/root/scripts/run_ovs.sh: line 45:   153 Illegal instruction     (core dumped) GLOG_log_dir=/tmp/logs ovs-vswitchd --pidfile --detach --no-chdir --mlockall --log-file=/tmp/logs/ovs-vswitchd.log
^C
root@f111597b0a04:~#

Now, I have pulled down the IPDK code and built a container locally, and when I run that image, it works fine:

vagrant@ubuntu-focal:~/ipdk/build$ docker images
REPOSITORY                               TAG           IMAGE ID       CREATED          SIZE
ghcr.io/ipdk-io/ipdk-ubuntu2004-x86_64   sha-bb7a773   9141307b21db   10 minutes ago   3.23GB
ghcr.io/ipdk-io/ipdk-ubuntu2004-x86_64   main          840c958df8e8   30 hours ago     3.23GB
ubuntu                                   20.04         54c9d81cbb44   3 weeks ago      72.8MB
vagrant@ubuntu-focal:~/ipdk/build$ cat ~/.ipdk/ipdk.env
#
# IPDK CLI configuration file
# See for more information https://github.com/ipdk-io/ipdk/blob/main/build/IPDK_Container/README_DOCKER.md#CLI-configuration-settings
#
# /home/vagrant/ipdk/build/scripts contains the location of the 'sourcing' ipdk script when read.
#
BASE_IMG=ubuntu:20.04
IMAGE_NAME=ghcr.io/ipdk-io/ipdk-ubuntu2004-x86_64
DOCKERFILE=/home/vagrant/ipdk/build/scripts/../Dockerfile.ubuntu
vagrant@ubuntu-focal:~/ipdk/build$ ipdk start -d
Loaded /home/vagrant/ipdk/build/scripts/ipdk_default.env
Loaded /home/vagrant/.ipdk/ipdk.env
Can't find update-binfmts.
Using docker run!
b9aee9f0aa34e94785023274592a48b9b3bd1e5e2d2f7b3fce6c2a8435e766ba
vagrant@ubuntu-focal:~/ipdk/build$ docker exec -it ipdk bash
root@b9aee9f0aa34:~# ps axuw
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.1  0.0   3976  3096 pts/0    Ss+  20:50   0:00 /bin/bash /root/scripts/start.sh rundaemon
root          41  0.0  0.0   5300  2676 ?        Ss   20:50   0:00 ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,ma
root          45 97.0  2.3 67827184 194716 ?     SLsl 20:50   0:07 ovs-vswitchd --pidfile --detach --no-chdir --mlockall --log-file=/tmp/logs/ovs-vswitchd.log
root          63  0.0  0.0   2520   596 pts/0    S+   20:50   0:00 sleep infinity
root          69  0.0  0.0   4240  3488 pts/1    Ss   20:50   0:00 bash
root          83  0.0  0.0   5896  2876 pts/1    R+   20:50   0:00 ps axuw
root@b9aee9f0aa34:~#

mestery commented 2 years ago

@stolsma for your review

stolsma commented 2 years ago

Hmmm, I don't understand... On my computer it just works. I don't have a lot of time tomorrow to dig into it. ☹️ I will go full force on it on Monday...

mestery commented 2 years ago

I'll take a look at this later today after some meetings are finished.

For reference, here is the CPU info from the VM where I'm trying to run the container:

vagrant@ubuntu-focal:~/ipdk/build$ cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 126
model name  : Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
stepping    : 5
cpu MHz     : 2304.000
cache size  : 8192 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 22
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti tpr_shadow flexpriority fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips    : 4608.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 126
model name  : Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
stepping    : 5
cpu MHz     : 2304.000
cache size  : 8192 KB
physical id : 0
siblings    : 4
core id     : 1
cpu cores   : 4
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 22
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti tpr_shadow flexpriority fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips    : 4608.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor   : 2
vendor_id   : GenuineIntel
cpu family  : 6
model       : 126
model name  : Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
stepping    : 5
cpu MHz     : 2304.000
cache size  : 8192 KB
physical id : 0
siblings    : 4
core id     : 2
cpu cores   : 4
apicid      : 2
initial apicid  : 2
fpu     : yes
fpu_exception   : yes
cpuid level : 22
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti tpr_shadow flexpriority fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips    : 4608.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model       : 126
model name  : Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
stepping    : 5
cpu MHz     : 2304.000
cache size  : 8192 KB
physical id : 0
siblings    : 4
core id     : 3
cpu cores   : 4
apicid      : 3
initial apicid  : 3
fpu     : yes
fpu_exception   : yes
cpuid level : 22
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti tpr_shadow flexpriority fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d arch_capabilities
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips    : 4608.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

vagrant@ubuntu-focal:~/ipdk/build$
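
The dump above lists avx and avx2 but no avx512 flags. A quick way to check a single flag is sketched below. The helper name `has_cpu_flag` is made up for illustration, and the sketch assumes the Linux /proc/cpuinfo layout shown above (a line starting with "flags").

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG is listed on the first "flags" line of a cpuinfo-style FILE.
# (Hypothetical helper; FILE defaults to /proc/cpuinfo.)
has_cpu_flag() {
    # Split the flags line into one token per line, then match FLAG exactly
    # so that "avx" does not match "avx2" or "avx512f".
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | tr ' \t' '\n\n' | grep -qx -- "$1"
}

# Report a few flags relevant to this issue when running on Linux.
if [ -r /proc/cpuinfo ]; then
    for f in avx avx2 avx512f avx512bw; do
        if has_cpu_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
    done
fi
```

Run inside the VM (or the container, which sees the same CPU), this should make it easy to compare the host against whatever instruction sets the image was built for.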

mestery commented 2 years ago

Running strace on ovs-vswitchd in the container results in this:

mprotect(0x7fc7765ef000, 4096, PROT_READ) = 0
munmap(0x7fc7765b7000, 43358)           = 0
set_tid_address(0x7fc7734d46d0)         = 99
set_robust_list(0x7fc7734d46e0, 24)     = 0
rt_sigaction(SIGRTMIN, {sa_handler=0x7fc775484bf0, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7fc7754923c0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0x7fc775484c90, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7fc7754923c0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
brk(NULL)                               = 0x55cf9f087000
brk(0x55cf9f0a8000)                     = 0x55cf9f0a8000
getrandom("\x79\x67\x2c\x34\x56\x58\x52\x8a", 8, 0) = 8
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x7fc773632b4b} ---
+++ killed by SIGILL (core dumped) +++
Illegal instruction (core dumped)
root@bdb91ca96701:~#
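
The SIGILL in the trace is what the shell prints as "Illegal instruction". In a script it can be told apart from other failures by the exit status: bash and dash report death by signal as 128 plus the signal number, and SIGILL is signal 4 on Linux, which gives 132. A minimal sketch follows; the wrapper name `run_and_report` is hypothetical and not part of the IPDK scripts.

```shell
#!/bin/sh
# run_and_report CMD [ARGS...]
# Run a command and say whether it was killed by SIGILL.
# (Hypothetical wrapper; assumes a shell that reports signal N as 128+N.)
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 132 ]; then      # 128 + 4 (SIGILL on Linux)
        echo "killed by SIGILL (illegal instruction)"
    else
        echo "exit status $status"
    fi
    return "$status"
}

# Harmless demonstration with a command that succeeds.
run_and_report true
```

Wrapping the ovs-vswitchd launch this way would be one option for surfacing the failure in the container logs instead of leaving the daemon silently absent.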

mestery commented 2 years ago

Hmmmmm.... it looks like the CPUs we're building DPDK on support AVX512, which I know my local CPU does not. I wonder if this is the issue?

2022-02-23T14:33:47.3453979Z #17 428.0 Fetching value of define "__AVX2__" : 1 (cached)
2022-02-23T14:33:48.5429862Z #17 429.1 Fetching value of define "__AVX512F__" : 1 (cached)
2022-02-23T14:33:48.5430200Z #17 429.1 Fetching value of define "__AVX512BW__" : 1 (cached)
2022-02-23T14:33:48.5430979Z #17 429.1 Compiler for C supports arguments -mavx512f: YES (cached)
2022-02-23T14:33:48.5431387Z #17 429.1 Compiler for C supports arguments -mavx512bw: YES (cached)
2022-02-23T14:33:48.5431786Z #17 429.1 Compiler for C supports arguments -march=skylake-avx512: YES
2022-02-23T14:33:48.5432097Z #17 429.1 Fetching value of define "__AVX2__" : 1 (cached)
2022-02-23T14:33:48.5432357Z #17 429.1 Fetching value of define "__AVX512F__" : 1 (cached)
2022-02-23T14:33:48.5535212Z #17 429.1 Fetching value of define "__AVX512BW__" : 1 (cached)
2022-02-23T14:33:48.5537485Z #17 429.1 Compiler for C supports arguments -mavx512f: YES (cached)
2022-02-23T14:33:48.5539651Z #17 429.1 Compiler for C supports arguments -mavx512bw: YES (cached)
2022-02-23T14:33:48.5541813Z #17 429.1 Compiler for C supports arguments -march=skylake-avx512: YES (cached)
2022-02-23T14:33:48.5544007Z #17 429.1 Compiler for C supports arguments -Wno-unused-value: YES (cached)
2022-02-23T14:33:48.5546223Z #17 429.1 Compiler for C supports arguments -Wno-unused-but-set-variable: YES (cached)
2022-02-23T14:33:48.5548418Z #17 429.1 Compiler for C supports arguments -Wno-unused-variable: YES (cached)
2022-02-23T14:33:48.5550604Z #17 429.1 Compiler for C supports arguments -Wno-unused-parameter: YES (cached)
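
To pull the same information out of a full build log without scrolling, the lines of the form shown above can be filtered. This is only a sketch: the function name `list_enabled_avx512` is made up, and it assumes the log uses exactly the `Fetching value of define "..." : 1` wording seen in this excerpt.

```shell
#!/bin/sh
# list_enabled_avx512 LOGFILE
# Print each AVX512 macro that the build log reports with value 1, once each.
# (Hypothetical helper; relies on the log wording shown in the excerpt above.)
list_enabled_avx512() {
    grep -o 'define "__AVX512[A-Z0-9]*__" : 1' -- "$1" \
        | cut -d'"' -f2 \
        | LC_ALL=C sort -u
}

# Demonstration on three lines taken from the log excerpt (timestamps dropped).
log=$(mktemp)
cat > "$log" <<'EOF'
#17 428.0 Fetching value of define "__AVX2__" : 1 (cached)
#17 429.1 Fetching value of define "__AVX512F__" : 1 (cached)
#17 429.1 Fetching value of define "__AVX512BW__" : 1 (cached)
EOF
list_enabled_avx512 "$log"
rm -f "$log"
```

For the three sample lines this prints `__AVX512BW__` and `__AVX512F__`. Any output at all means the build machine's compiler was targeting AVX512.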

mestery commented 2 years ago

@Namrata-intel Do you know how to compile DPDK without AVX512 support, perhaps falling back to SSE2, which is supported on far more CPUs?

Namrata-intel commented 2 years ago

You can try changing these settings in the configs and rebuilding. It seems to be configurable, though I have never tried it:

CONFIG_RTE_ENABLE_AVX=y
CONFIG_RTE_ENABLE_AVX512=n

mestery commented 2 years ago

@Namrata-intel Any pointers to where exactly in the build I would set this?

mestery commented 2 years ago

I think I figured it out, so I forked p4-dpdk-target to try my fix from here: https://github.com/ipdk-io/p4-dpdk-target/commit/cd1ee1ed225e84ee779ff53c8d6daaf8ab32d2c1

If this works, I'll push a PR to the p4-dpdk-target repository.

mestery commented 2 years ago

Well, my attempted fix isn't working. The layers of build options, nested in git submodules with hard-coded build commands, are making this challenging for me. If anyone can help switch to a more generic set of CPU options that will work in virtual machines (VirtualBox does not support AVX512, for example), that would be amazing. Until then, we have to build locally and can't use the images in GHCR.
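
One way to judge whether a candidate target is "generic" enough is to ask the compiler which AVX512 macros it predefines for a given -march value. The sketch below only inspects the compiler. It does not touch the DPDK or p4-dpdk-target build, whose option names are not covered here. The helper name `avx512_defines` is hypothetical, and the example assumes gcc (or a compatible compiler) on x86_64.

```shell
#!/bin/sh
# avx512_defines CC MARCH
# Print the AVX512-related macros that compiler CC predefines for -march=MARCH.
# (Hypothetical helper; prints nothing if there are none or CC rejects MARCH.)
avx512_defines() {
    "$1" -march="$2" -dM -E -x c - </dev/null 2>/dev/null | grep '__AVX512' || true
}

# Compare the baseline x86-64 target with the one seen in the build log.
if command -v gcc >/dev/null 2>&1; then
    echo "-march=x86-64:"
    avx512_defines gcc x86-64
    echo "-march=skylake-avx512:"
    avx512_defines gcc skylake-avx512
fi
```

A target that prints no AVX512 macros here should not lead the compiler to emit AVX512 instructions on its own, although hand-written intrinsics or per-file flags in a project can still introduce them.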

mestery commented 2 years ago

This should fix it: https://github.com/p4lang/p4-dpdk-target/pull/18

@Namrata-intel Do you know who can review that patch to p4-dpdk-target to disable AVX512?