luigirizzo / netmap

Automatically exported from code.google.com/p/netmap
BSD 2-Clause "Simplified" License
1.86k stars 537 forks source link

FreeBSD reboots when exiting pkt-gen using emulated adapters #722

Open brian90013 opened 4 years ago

brian90013 commented 4 years ago

Problem

I have seen the following issue on two different FreeBSD servers using three different NICs. I run two copies of pkt-gen on the same server; one as the transmitter and one as the receiver with the two ports connected by a loopback cable. Netmap is used in generic adapter mode (the only option for mlx5en, used for comparison testing with cxgbe and igb) and the receiver is configured for more than one queue.

When I stop the receiving pkt-gen using CTRL-C, I will frequently observe the machine immediately reboot. I have used sysctl debug.kdb.panic=1 (BEWARE, this will panic your system immediately) to verify my machines handle panics and write out a core image. I do not see the normal panic behavior in this case. Instead the machine immediately reboots. My one machine does print to /var/log/messages and the messages about the fault are below. I see the same problem using each of the three NICs listed.

I see the same problem using my own netmap application, but have used pkt-gen here for ease of reproducing the problem. Therefore I don't think it is a problem with pkt-gen but perhaps the underlying generic adapter code? I am hoping to use multiple queues in generic adapter mode to benefit from NICs with hardware queues that don't have native netmap support. Thank you for your help!

Procedure

# Force emulated mode (for igb, automatic for cc and mce)
sysctl dev.netmap.admode=2

# Create 4 rings for emulated devices
sysctl dev.netmap.generic_rings=4

# Create transmitter
pkt-gen -i igb0 -f tx

# Separate window, create receiver
pkt-gen -i igb1 -f rx -c 4 -p 4

Process output

Transmitter output

./pkt-gen -i igb0 -f tx 
395.838050 main [2889] interface is igb0
395.838063 main [3011] using default burst size: 512
395.838066 main [3019] running on 1 cpus (have 8)
395.838116 extract_ip_range [471] range is 10.0.0.1:1234 to 10.0.0.1:1234
395.838120 extract_ip_range [471] range is 10.1.0.1:1234 to 10.1.0.1:1234
395.838134 nm_open [858] overriding ARG1 0
395.838136 nm_open [862] overriding ARG2 0
395.838137 nm_open [866] overriding ARG3 0
395.838138 nm_open [870] overriding RING_CFG
395.838139 nm_open [879] overriding ifname igb0 ringid 0x0 flags 0x8001
396.020191 main [3117] mapped 334980KB at 0x800e00000
Sending on netmap:igb0: 4 queues, 1 threads and 1 cpus.
10.0.0.1 -> 10.1.0.1 (00:00:00:00:00:00 -> ff:ff:ff:ff:ff:ff)
396.020211 main [3224] Sending 512 packets every  0.000000000 s
396.020241 start_threads [2549] Wait 2 secs for phy reset
398.146182 start_threads [2551] Ready...
398.146213 sender_body [1580] start, fd 3 main_fd 3
398.146235 sender_body [1638] frags 1 frag_size 60
398.212350 sender_body [1676] drop copy
399.147959 main_thread [2639] 1.490 Mpps (1.492 Mpkts 716.365 Mbps in 1001747 usec) 39.19 avg_batch 0 min_space
400.149959 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.744 Mbps in 1001999 usec) 38.94 avg_batch 99999 min_space
401.151959 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.764 Mbps in 1002001 usec) 38.82 avg_batch 99999 min_space
402.153959 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.736 Mbps in 1002000 usec) 38.79 avg_batch 99999 min_space
403.155872 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.727 Mbps in 1001912 usec) 38.76 avg_batch 99999 min_space
404.156959 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.099 Mbps in 1001087 usec) 38.70 avg_batch 99999 min_space
405.157958 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.038 Mbps in 1001000 usec) 38.68 avg_batch 99999 min_space
406.179959 main_thread [2639] 1.488 Mpps (1.521 Mpkts 730.034 Mbps in 1022001 usec) 38.48 avg_batch 99999 min_space
407.180958 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.016 Mbps in 1000999 usec) 38.72 avg_batch 99999 min_space
408.182958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.787 Mbps in 1002000 usec) 38.58 avg_batch 99999 min_space
409.184958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.729 Mbps in 1002000 usec) 38.62 avg_batch 99999 min_space
410.185958 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.077 Mbps in 1001000 usec) 38.65 avg_batch 99999 min_space
411.187368 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.319 Mbps in 1001410 usec) 38.58 avg_batch 99999 min_space
412.188958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.451 Mbps in 1001590 usec) 38.66 avg_batch 99999 min_space
413.189958 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.020 Mbps in 1001000 usec) 40.74 avg_batch 99999 min_space
414.191964 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.791 Mbps in 1002006 usec) 40.18 avg_batch 99999 min_space
415.194959 main_thread [2639] 1.488 Mpps (1.493 Mpkts 716.438 Mbps in 1002995 usec) 39.80 avg_batch 99999 min_space
416.195958 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.073 Mbps in 1000999 usec) 39.58 avg_batch 99999 min_space
417.197958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.736 Mbps in 1002000 usec) 39.68 avg_batch 99999 min_space
418.199958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.756 Mbps in 1002000 usec) 39.64 avg_batch 99999 min_space
419.200958 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.060 Mbps in 1000999 usec) 39.66 avg_batch 99999 min_space
420.202958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.751 Mbps in 1002000 usec) 39.70 avg_batch 99999 min_space
421.203957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.024 Mbps in 1001000 usec) 41.81 avg_batch 99999 min_space
422.205958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.774 Mbps in 1002000 usec) 36.84 avg_batch 99999 min_space
423.269461 main_thread [2639] 1.488 Mpps (1.583 Mpkts 759.683 Mbps in 1063504 usec) 37.85 avg_batch 99999 min_space
424.320960 main_thread [2639] 1.488 Mpps (1.565 Mpkts 751.104 Mbps in 1051499 usec) 37.31 avg_batch 99999 min_space
425.322958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.748 Mbps in 1001998 usec) 37.23 avg_batch 99999 min_space
426.324957 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.751 Mbps in 1001999 usec) 37.39 avg_batch 99999 min_space
427.325957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.044 Mbps in 1001000 usec) 37.34 avg_batch 99999 min_space
428.327957 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.757 Mbps in 1002000 usec) 37.30 avg_batch 99999 min_space
429.328957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.044 Mbps in 1001000 usec) 37.26 avg_batch 99999 min_space
430.329957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.029 Mbps in 1001000 usec) 37.36 avg_batch 99999 min_space
431.373057 main_thread [2639] 1.488 Mpps (1.552 Mpkts 745.133 Mbps in 1043100 usec) 37.31 avg_batch 99999 min_space
432.373957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 714.983 Mbps in 1000899 usec) 37.30 avg_batch 99999 min_space
433.375957 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.742 Mbps in 1002000 usec) 37.47 avg_batch 99999 min_space
Connection to m1 closed by remote host.

Receiver

pkt-gen -i igb1 -f rx -c 4 -p 4
422.834488 main [2889] interface is igb1
422.834512 main [3011] using default burst size: 512
422.834515 main [3019] running on 4 cpus (have 8)
422.834569 extract_ip_range [471] range is 10.0.0.1:1234 to 10.0.0.1:1234
422.834572 extract_ip_range [471] range is 10.1.0.1:1234 to 10.1.0.1:1234
422.834586 nm_open [858] overriding ARG1 0
422.834588 nm_open [862] overriding ARG2 0
422.834588 nm_open [866] overriding ARG3 0
422.834589 nm_open [870] overriding RING_CFG
422.834590 nm_open [879] overriding ifname igb1 ringid 0x0 flags 0x8004
423.016613 main [3117] mapped 334980KB at 0x800e00000
Receiving from netmap:igb1: 4 queues, 4 threads and 4 cpus.
423.016715 nm_open [879] overriding ifname igb1 ringid 0x1 flags 0x8004
423.016770 nm_mmap [989] do not mmap, inherit from parent
423.016816 nm_open [879] overriding ifname igb1 ringid 0x2 flags 0x8004
423.016872 nm_mmap [989] do not mmap, inherit from parent
423.016911 nm_open [879] overriding ifname igb1 ringid 0x3 flags 0x8004
423.016968 nm_mmap [989] do not mmap, inherit from parent
423.016975 start_threads [2549] Wait 2 secs for phy reset
425.017957 start_threads [2551] Ready...
425.018001 receiver_body [1807] reading from netmap:igb1 fd 3 main_fd 3
425.018014 receiver_body [1807] reading from netmap:igb1 fd 4 main_fd 3
425.018031 receiver_body [1807] reading from netmap:igb1 fd 5 main_fd 3
425.018069 receiver_body [1807] reading from netmap:igb1 fd 6 main_fd 3
426.018957 receiver_body [1822] waiting for initial packets, poll returns 0 0
426.019957 main_thread [2639] 1.489 Mpps (1.492 Mpkts 716.217 Mbps in 1001890 usec) 9.00 avg_batch 0 min_space
427.020957 receiver_body [1822] waiting for initial packets, poll returns 0 0
427.021957 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.776 Mbps in 1002000 usec) 9.02 avg_batch 300962 min_space
428.021957 receiver_body [1822] waiting for initial packets, poll returns 0 0
428.022957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.018 Mbps in 1000999 usec) 9.02 avg_batch 300997 min_space
429.022957 receiver_body [1822] waiting for initial packets, poll returns 0 0
429.023957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.056 Mbps in 1001001 usec) 9.03 avg_batch 300995 min_space
430.023957 receiver_body [1822] waiting for initial packets, poll returns 0 0
430.024957 main_thread [2639] 1.488 Mpps (1.490 Mpkts 715.033 Mbps in 1001000 usec) 9.04 avg_batch 300989 min_space
431.025957 receiver_body [1822] waiting for initial packets, poll returns 0 0
431.026958 main_thread [2639] 1.488 Mpps (1.491 Mpkts 715.752 Mbps in 1002000 usec) 9.02 avg_batch 300958 min_space
^C431.842178 sigint_h [562] received control-C on thread 0x800721000
431.842187 main_thread [2639] 1.488 Mpps (1.213 Mpkts 582.373 Mbps in 815230 usec) 9.04 avg_batch 300960 min_space
432.050957 receiver_body [1822] waiting for initial packets, poll returns 0 0
432.863957 main_thread [2639] 16.000 pps (16.000 pkts 7.680 Kbps in 1021770 usec) 16.00 avg_batch 1008 min_space
Received 10156735 packets 609404100 bytes 1125343 events 60 bytes each in 0.00 seconds.
Speed: 10.157 Tpps Bandwidth: 4875.233 Tbps (raw 4875.233 Tbps). Average batch: 9.03 pkts
Connection to m1 closed by remote host.

/var/log/messages

Sep 10 10:16:36 m1 kernel: 395.838144 [1130] generic_netmap_attach     Emulated adapter for igb0 created (prev was igb0)
Sep 10 10:16:36 m1 kernel: 395.936108 [ 320] generic_netmap_register   Emulated adapter for igb0 activated
Sep 10 10:16:52 m1 kernel: 412.168500 [1130] generic_netmap_attach     Emulated adapter for igb1 created (prev was igb1)
Sep 10 10:16:52 m1 kernel: 412.266406 [ 320] generic_netmap_register   Emulated adapter for igb1 activated
Sep 10 10:17:00 m1 kernel: 420.850072 [ 295] generic_netmap_unregister Emulated adapter for igb1 deactivated
Sep 10 10:17:00 m1 kernel: 420.935376 [1027] generic_netmap_dtor       Native netmap adapter for igb1 restored
Sep 10 10:17:01 m1 kernel: 421.023725 [1035] generic_netmap_dtor       Emulated netmap adapter for igb1 destroyed
Sep 10 10:17:03 m1 kernel: 422.834594 [1130] generic_netmap_attach     Emulated adapter for igb1 created (prev was igb1)
Sep 10 10:17:03 m1 kernel: 422.932437 [ 320] generic_netmap_register   Emulated adapter for igb1 activated
Sep 10 10:17:12 m1 kernel: 
Sep 10 10:17:12 m1 kernel: 432.864074 [ 295] generic_netmap_unregister Emulated adapter for igb1 deactivated
Sep 10 10:17:12 m1 kernel: 
Sep 10 10:17:12 m1 kernel: Fatal trap 12: page fault while in kernel mode
Sep 10 10:17:12 m1 kernel: cpuid = 2; 
Sep 10 10:17:13 m1 kernel: 433.003472 [1027] generic_netmap_dtor       Native netmap adapter for igb1 restored
Sep 10 10:17:13 m1 kernel: apic id = 02
Sep 10 10:17:13 m1 kernel: fault virtual address    = 0x0
Sep 10 10:17:13 m1 kernel: fault code       = supervisor read instruction, page not present
Sep 10 10:17:13 m1 kernel: 433.147976 [1035] generic_netmap_dtor       Emulated netmap adapter for igb1 destroyed
Sep 10 10:17:13 m1 kernel: instruction pointer  = 0x20:0x0
Sep 10 10:17:13 m1 kernel: stack pointer            = 0x28:0xfffffe0075dcf8b8
Sep 10 10:17:13 m1 kernel: frame pointer            = 0x28:0xfffffe0075dcf9a0
Sep 10 10:17:13 m1 kernel: code segment     = base 0x0, limit 0xfffff, type 0x1b
Sep 10 10:17:13 m1 kernel:          = DPL 0, pres 1, long 1, def32 0, gran 1
Sep 10 10:17:13 m1 kernel: processor eflags = interrupt enabled, 
Sep 10 10:17:13 m1 kernel: resume, IOPL = 0
Sep 10 10:17:13 m1 kernel: current process      = 0 (if_io_tqg_2)
Sep 10 10:17:13 m1 kernel: trap number      = 12
brian90013 commented 4 years ago

Last week I upgraded one of my machines to follow FreeBSD-12.2-BETA1. This morning I observed the same reboot behavior with the cxgbe driver in emulated mode on 12.2.

uname
FreeBSD tr 12.2-BETA1 FreeBSD 12.2-BETA1 r365761 GENERIC  amd64
brian90013 commented 4 years ago

I have updated the test machine to run FreeBSD tr 12.2-STABLE FreeBSD 12.2-STABLE #2 r366977M. I am still seeing this issue with multiple drivers, but with INVARIANTS enabled I have a bit more output to report.

igb driver

Output is above, fatal trap 12 with instruction pointer set to 0. It appears the process was a worker thread for the network driver.

cxgbe driver

Same fatal trap, same fault code, same instruction pointer. This time the debugger reports thread irq308: t6nex0: 1a4 as the culprit. The first cxgbe function in the stack trace is t4_vi_intr() which is an interrupt handler "for vectors shared between NIC and netmap rx queues". Here we are in emulated mode so the netmap-specific code in the driver should not be running.

mlx5en driver

Same trap, fault, instruction pointer. Thread irq353: mlx5_core1 is running and the stack trace ends with mlx5e_rx_cq_comp().

Summary

So I see the same behavior with all three NICs - the common code is the emulated netmap adapter. The problem is seen only when the receiver has multiple threads running. My guess is as pkt-gen is shutting down, the netmap driver has removed itself from some of the queues when data comes in other queues and the NIC attempts to call a netmap handler that has been removed, causing the fault? That would explain why it only occurs when receiving with multiple threads and why the stack traces all point to NIC IRQ code, not anything in the netmap driver.