@dch thanks for all of the details! Btw, was it GENERIC kernel you tested on?
yes.
I've been seeing this a lot (roughly every 5-10 minutes) after 0d574d8ba8b244f40c1484123c5042f49ac642b8 with https://reviews.freebsd.org/D40094. Sometimes it happens so early that the Ten64 doesn't even complete the switch to userland. It may be a generic arm64 issue, but I'm not seeing it on other hardware pushing a lot more traffic.
@dch I'm not sure about https://github.com/mcusim/freebsd-src/commit/0d574d8ba8b244f40c1484123c5042f49ac642b8, but I've modified address translation recently. Could you try https://github.com/mcusim/freebsd-src/commit/718bdb6a71ba4ed1f557f89af1482a10f7b1cb74 and the one before it, https://github.com/mcusim/freebsd-src/commit/74192f9b2d240edbd72215b8ee770485502ce8ee?
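Not from the thread, but for reference, a minimal sketch of how each of those commits could be built and tested, assuming /usr/src is a checkout of the mcusim/freebsd-src fork:

# Sketch only: assumes /usr/src is a checkout of the mcusim/freebsd-src fork
cd /usr/src
git checkout 718bdb6a71ba4ed1f557f89af1482a10f7b1cb74
make -j"$(sysctl -n hw.ncpu)" buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
shutdown -r now
# After testing, repeat from 74192f9b2d240edbd72215b8ee770485502ce8ee and compare.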
Sorry it took a while, but 718bdb6a71ba4ed1f557f89af1482a10f7b1cb74 is the culprit. Reverting it and we're all OK again.
The original report here is from Mar 22, but that commit is from May 11, so the time relationship seems wrong for 718bdb6 to be the only issue.
Correct, I thought that was clear from the original title & updated comment.
The "vm_fault failed" panic is still present with 718bdb6 included; panics are frequent, every 5-10 minutes.
@dch Thanks for the summary, that's how I understood the issue. Its root cause, I assume, is different channels accessing bus_dma resources concurrently. You won't see those panics with only one channel up and running. Just FYI, I'm trying to isolate channels within their own tasks and to limit access to shared resources as much as possible.
@dch I've prepared a lot of changes in the https://github.com/mcusim/freebsd-src/tree/dpaa2 branch. Could you try it? A GENERIC kernel had worked for me under high network load for ~14 hours when I stopped the test myself. Btw, I've also discovered that the kernel panics with "undefined instruction" when the Ten64's SoC heats up to 80-90C (sysctl hw.temperature). Please keep an eye on it.
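Not from the thread, just a sketch for keeping an eye on it: a small sh loop that logs the SoC temperature once per second, assuming the hw.temperature sysctl mentioned above exists on this board:

# Sketch: log the SoC temperature once per second (assumes the hw.temperature OID above)
while true; do
    date
    sysctl hw.temperature
    sleep 1
done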
It should be fixed on CURRENT with https://cgit.freebsd.org/src/commit/?id=58983e4b0253ad38a3e1ef2166fedd3133fdb552 merged in.
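As an aside (a sketch, not from the thread): one way to confirm that fix is already part of a given source checkout is to ask git whether the commit is an ancestor of HEAD:

# Sketch: check whether the fix commit is an ancestor of the current checkout
cd /usr/src
git merge-base --is-ancestor 58983e4b0253ad38a3e1ef2166fedd3133fdb552 HEAD \
    && echo "fix present" || echo "fix missing"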
So far LGTM on 15.0-CURRENT - a 3h test (albeit on 1G ifaces only) is stable. Awesome! I need to move some cabling around for 10G, but this is great progress!
thanks @dsalychev
I'm on stable/14 and am planning to switch to releng/14.0 when it's branched off, but it also seems stable.
But regarding the SFP+ ports, I'm not able to connect to them. I have an Intel X520-DA2 card:
ix0@pci0:1:0:0: class=0x020000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x10fb subvendor=0x8086 subdevice=0x7a11
    vendor     = 'Intel Corporation'
    device     = '82599ES 10-Gigabit SFI/SFP+ Network Connection'
    class      = network
    subclass   = ethernet
It links up when plugged in via loopback, but not when I plug it into the Ten64. I haven't reported it yet because I still haven't verified that it works under Linux.
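For what it's worth, a hedged sketch of what one might collect on the FreeBSD side while debugging that link (ifconfig -v prints SFP+ module data where the driver exposes it):

# Sketch: gather link-state and transceiver info for the ix0 port
ifconfig ix0            # link state and negotiated media
ifconfig -v ix0         # SFP+ module/transceiver details, if the driver exposes them
dmesg | grep -i -e sfp -e ix0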
@dch, @pkubaj thanks for all of the tests. Please don't expect SFP+ to be operational at the moment. I've just started working on the design of something I call "sffbus" (similar to miibus(4)).
Using e04c4b4a369df3f1dcbebbdf726193f02af60801, this is still stable. Thanks!
Good to know :) Thanks for testing!
This only reproduces when more cross-dpaa-interface traffic than usual is present. I can trigger it reliably using iperf3. This is on stock CURRENT, not the fork.
At the moment of the crash I had the following running (in tmux over mosh):
while true; vmstat -i | grep dpaa2_io; sleep 1; end
top -SjwHPz -mcpu
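For anyone trying to reproduce: a minimal iperf3 invocation (a sketch; the address and interface pairing are placeholders) that pushes sustained cross-interface traffic:

# Sketch: drive traffic between two dpaa2 interfaces (10.0.0.1 is a placeholder)
iperf3 -s &                        # server bound to one interface's address
iperf3 -c 10.0.0.1 -P 8 -t 600     # client: 8 parallel streams for 10 minutes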