ntop / PF_RING

High-speed packet processing framework
http://www.ntop.org
GNU Lesser General Public License v2.1
2.69k stars 349 forks source link

i40e_zc continuous attach/detach #88

Closed JeremyAshton-shopify closed 8 years ago

JeremyAshton-shopify commented 8 years ago

Now that we have implemented a work-around for https://github.com/ntop/PF_RING/issues/81 I am investigating the following errors:

Apr 14 15:02:28 <redacted> kernel: [65593.708040] [PF_RING-ZC] i40e_down: detach <isp4>
Apr 14 15:02:28 <redacted> kernel: [65593.708044] [PF_RING-ZC] i40e_down: detaching <isp4> while in use
Apr 14 15:02:28 <redacted> kernel: [65593.710229] i40e 0000:02:00.1: VEB bw config failed, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EINVAL
Apr 14 15:02:28 <redacted> kernel: [65593.710232] i40e 0000:02:00.1: Failed configuring TC for VEB seid=289
Apr 14 15:02:28 <redacted> kernel: [65593.711285] i40e 0000:02:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 14 15:02:28 <redacted> kernel: [65593.711287] i40e 0000:02:00.1: Failed configuring TC map 255 for VSI 526
Apr 14 15:02:28 <redacted> kernel: [65593.711294] i40e 0000:02:00.1: Failed configuring TC for VSI seid=526
Apr 14 15:02:28 <redacted> kernel: [65593.713389] i40e 0000:02:00.1 <isp4>: adding 0c:c4:7a:bb:f7:65 vid=0
Apr 14 15:02:28 <redacted> kernel: [65593.715302] [PF_RING-ZC] i40e_up_complete: attach <isp4> [pf start=3279945728 len=8388608][cache_line_size=64][MSIX enabled]
Apr 14 15:02:28 <redacted> kernel: [65593.730512] [PF_RING-ZC] i40e_down: detach <isp4>
Apr 14 15:02:28 <redacted> kernel: [65593.730517] [PF_RING-ZC] i40e_down: detaching <isp4> while in use
Apr 14 15:02:28 <redacted> kernel: [65593.732754] i40e 0000:02:00.1 <isp4>: adding 0c:c4:7a:bb:f7:65 vid=0
Apr 14 15:02:28 <redacted> kernel: [65593.734335] [PF_RING-ZC] i40e_up_complete: attach <isp4> [pf start=3279945728 len=8388608][cache_line_size=64][MSIX enabled]
Apr 14 15:02:28 <redacted> kernel: [65594.024707] [PF_RING-ZC] i40e_down: detach <isp4>
Apr 14 15:02:28 <redacted> kernel: [65594.024711] [PF_RING-ZC] i40e_down: detaching <isp4> while in use
Apr 14 15:02:28 <redacted> kernel: [65594.026342] i40e 0000:02:00.1: VEB bw config failed, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EINVAL
Apr 14 15:02:28 <redacted> kernel: [65594.026347] i40e 0000:02:00.1: Failed configuring TC for VEB seid=289
Apr 14 15:02:28 <redacted> kernel: [65594.027388] i40e 0000:02:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 14 15:02:28 <redacted> kernel: [65594.027391] i40e 0000:02:00.1: Failed configuring TC map 255 for VSI 526
Apr 14 15:02:28 <redacted> kernel: [65594.027393] i40e 0000:02:00.1: Failed configuring TC for VSI seid=526
Apr 14 15:02:28 <redacted> kernel: [65594.029484] i40e 0000:02:00.1 <isp4>: adding 0c:c4:7a:bb:f7:65 vid=0
Apr 14 15:02:28 <redacted> kernel: [65594.031304] [PF_RING-ZC] i40e_up_complete: attach <isp4> [pf start=3279945728 len=8388608][cache_line_size=64][MSIX enabled]
Apr 14 15:02:58 <redacted> kernel: [65623.714447] [PF_RING-ZC] i40e_down: detach <isp4>
Apr 14 15:02:58 <redacted> kernel: [65623.714452] [PF_RING-ZC] i40e_down: detaching <isp4> while in use
Apr 14 15:02:58 <redacted> kernel: [65623.715625] i40e 0000:02:00.1: VEB bw config failed, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EINVAL
Apr 14 15:02:58 <redacted> kernel: [65623.715629] i40e 0000:02:00.1: Failed configuring TC for VEB seid=289
Apr 14 15:02:58 <redacted> kernel: [65623.717595] i40e 0000:02:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 14 15:02:58 <redacted> kernel: [65623.717600] i40e 0000:02:00.1: Failed configuring TC map 255 for VSI 526
Apr 14 15:02:58 <redacted> kernel: [65623.717602] i40e 0000:02:00.1: Failed configuring TC for VSI seid=526
Apr 14 15:02:58 <redacted> kernel: [65623.718746] i40e 0000:02:00.1 <isp4>: adding 0c:c4:7a:bb:f7:65 vid=0
Apr 14 15:02:58 <redacted> kernel: [65623.720511] [PF_RING-ZC] i40e_up_complete: attach <isp4> [pf start=3279945728 len=8388608][cache_line_size=64][MSIX enabled]
...
continued ad nauseam
...

dkms:

$ dkms status
e1000e-zc, 3.2.7.1.512, 3.19.0-58-generic, x86_64: installed
i40e-zc, 1.3.49.512, 3.19.0-58-generic, x86_64: installed
igb-zc, 5.3.3.5.512, 3.19.0-58-generic, x86_64: installed
ixgbe-zc, 4.1.5.512, 3.19.0-58-generic, x86_64: installed
pfring, 6.3.0, 3.19.0-58-generic, x86_64: installed
sysdig, 0.8.0, 3.19.0-58-generic, x86_64: installed
 $ cat /proc/net/pf_ring/info
PF_RING Version          : 6.3.0 (dev:83c088ff553c945e57bec1ec2aaf366a6539b0a6)
Total rings              : 24

Standard (non DNA/ZC) Options
Ring slots               : 4096
Slot version             : 16
Capture TX               : Yes [RX+TX]
IP Defragment            : No
Socket Mode              : Standard
Total plugins            : 0
Cluster Fragment Queue   : 0
Cluster Fragment Discard : 0
$ cat /sys/module/i40e_zc/*version
B778D1364EFA078B0FD4CB7
1.3.49
$ ethtool -i <isp4>
driver: i40e
version: 1.3.49
firmware-version: 4.42 0x8000191c 0.0.0
bus-info: 0000:02:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

I wonder if someone might offer insight as to next steps.

cardigliano commented 8 years ago

What did you do to produce this? Please provide your configuration/steps you followed

JeremyAshton-shopify commented 8 years ago
$ cat /etc/pf_ring/pf_ring.conf
enable_tx_capture=1
$ cat /etc/pf_ring/hugepages.conf
node=0 hugepagenumber=32768
$ cat /etc/pf_ring/zc/i40e/i40e.conf
RSS=1,1
description "zbalance_ipc packet capture fanout - <isp4>"

start on started pf_ring
stop on stopping pf_ring

respawn
respawn limit 1 10  # At most once per 10 seconds

pre-start script
  /sbin/ethtool -A <isp4> autoneg off rx off tx off &> /dev/null || true
  /sbin/ethtool -G <isp4> rx 4096 tx 4096 &> /dev/null || true
  /sbin/ethtool -K <isp4> tx off rx off sg off tso off gso off gro off rxvlan off txvlan off ntuple off rxhash off &> /dev/null || true
  /sbin/ethtool -L <isp4> combined 1 &> /dev/null || true
  /sbin/ip link set <isp4> up arp off multicast off mtu 9000 promisc on &> /dev/null || true
end script

script
  /usr/bin/chpst -u root:root /usr/local/bin/zbalance_ipc -i zc:<isp4> -c 40 -n 2 -m 2 -g 0 -p
end script
JeremyAshton-shopify commented 8 years ago

Now it is generating the following:

Apr 14 19:07:31 <redacted> kernel: [80301.916281] [PF_RING-ZC] i40e_down: detach <isp4>
Apr 14 19:07:31 <redacted> kernel: [80301.916285] [PF_RING-ZC] i40e_down: detaching <isp4> while in use
Apr 14 19:07:31 <redacted> kernel: [80301.918585] i40e 0000:02:00.1 <isp4>: adding 0c:c4:7a:bb:f7:65 vid=0
Apr 14 19:07:31 <redacted> kernel: [80301.920341] [PF_RING-ZC] i40e_up_complete: attach <isp4> [pf start=3279945728 len=8388608][cache_line_size=64][MSIX enabled]
Apr 14 19:07:32 <redacted> kernel: [80303.533973] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:15]
Apr 14 19:07:32 <redacted> kernel: [80303.542315] Modules linked in: 8021q garp mrp stp llc ipmi_devintf i40e_zc(OE) configfs pf_ring(OE) bonding ipmi_ssif intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper sb_edac joydev edac_core drm i2c_algo_bit syscopyarea sysfillrect sysimgblt lpc_ich mei_me ioatdma shpchp mei wmi ipmi_si 8250_fintek ipmi_msghandler acpi_power_meter mac_hid lp parport xfs libcrc32c ses enclosure raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor ixgbe hid_generic dca vxlan raid6_pq ip6_udp_tunnel udp_tunnel usbhid raid1 ahci ptp hid libahci raid0 megaraid_sas mdio pps_core multipath linear [last unloaded: i40e]
Apr 14 19:07:32 <redacted> kernel: [80303.542361] CPU: 1 PID: 15 Comm: migration/1 Tainted: G           OEL 3.19.0-58-generic #64~14.04.1-Ubuntu
Apr 14 19:07:32 <redacted> kernel: [80303.542362] Hardware name: Supermicro SSG-6048R-E1CR36H/X10DRH-iT, BIOS 1.1 05/15/2015
Apr 14 19:07:32 <redacted> kernel: [80303.542364] task: ffff883f665089d0 ti: ffff883f66514000 task.ti: ffff883f66514000
Apr 14 19:07:32 <redacted> kernel: [80303.542365] RIP: 0010:[<ffffffff8110edcc>]  [<ffffffff8110edcc>] multi_cpu_stop+0x5c/0xf0
Apr 14 19:07:32 <redacted> kernel: [80303.542369] RSP: 0000:ffff883f66517d68  EFLAGS: 00000246
Apr 14 19:07:32 <redacted> kernel: [80303.542370] RAX: ffffffff81815720 RBX: ffff882f4aa044b0 RCX: 000000000000250a
Apr 14 19:07:32 <redacted> kernel: [80303.542371] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff887f64957ab0
Apr 14 19:07:32 <redacted> kernel: [80303.542371] RBP: ffff883f66517d88 R08: ffff883f66514000 R09: 0000000000000001
Apr 14 19:07:32 <redacted> kernel: [80303.542372] R10: 00000000000027f8 R11: 0000000000000002 R12: ffffffff8109f390
Apr 14 19:07:32 <redacted> kernel: [80303.542373] R13: ffff883f66517ce8 R14: ffff883f7f913e80 R15: ffff883f7f913e80
Apr 14 19:07:32 <redacted> kernel: [80303.542374] FS:  0000000000000000(0000) GS:ffff883f7f880000(0000) knlGS:0000000000000000
Apr 14 19:07:32 <redacted> kernel: [80303.542375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 14 19:07:32 <redacted> kernel: [80303.542376] CR2: 00007f44b0a03f77 CR3: 0000000001c16000 CR4: 00000000001407e0
Apr 14 19:07:32 <redacted> kernel: [80303.542377] Stack:
Apr 14 19:07:32 <redacted> kernel: [80303.542378]  ffff887f64957ad8 ffff883f7f88e9e0 ffff883f7f88e9e8 ffff887f64957ab0
Apr 14 19:07:32 <redacted> kernel: [80303.542380]  ffff883f66517e58 ffffffff8110eff2 ffff883f66508fa8 ffffffff8181bac0
Apr 14 19:07:32 <redacted> kernel: [80303.542381]  ffff883f7f893e80 ffff883f7f893e80 0000000000000000 ffff883f636dbb80
Apr 14 19:07:32 <redacted> kernel: [80303.542383] Call Trace:
Apr 14 19:07:32 <redacted> kernel: [80303.542386]  [<ffffffff8110eff2>] cpu_stopper_thread+0x82/0x170
Apr 14 19:07:32 <redacted> kernel: [80303.542390]  [<ffffffff817b9e4b>] ? __schedule+0x37b/0x850
Apr 14 19:07:32 <redacted> kernel: [80303.542394]  [<ffffffff81098c13>] smpboot_thread_fn+0x143/0x1b0
Apr 14 19:07:32 <redacted> kernel: [80303.542396]  [<ffffffff81098ad0>] ? SyS_setgroups+0x180/0x180
Apr 14 19:07:32 <redacted> kernel: [80303.542399]  [<ffffffff81094db2>] kthread+0xd2/0xf0
Apr 14 19:07:32 <redacted> kernel: [80303.542402]  [<ffffffff81094ce0>] ? kthread_create_on_node+0x1c0/0x1c0
Apr 14 19:07:32 <redacted> kernel: [80303.542405]  [<ffffffff817be6d8>] ret_from_fork+0x58/0x90
Apr 14 19:07:32 <redacted> kernel: [80303.542407]  [<ffffffff81094ce0>] ? kthread_create_on_node+0x1c0/0x1c0
Apr 14 19:07:32 <redacted> kernel: [80303.542408] Code: 89 db 48 0f a3 18 19 db 85 db 41 0f 95 c5 31 f6 31 d2 eb 1c 0f 1f 00 83 fb 03 75 05 45 84 ed 75 66 f0 41 ff 4c 24 24 74 26 89 da <83> fa 04 74 3d f3 90 41 8b 5c 24 20 39 d3 74 f0 83 fb 02 75 d7
cardigliano commented 8 years ago

Not sure what is causing last trace, however I made some changes to the pf_ring kernel module to avoid playing with promisc when already configured via ip/ifconfig, to make sure this is not caused by the card reset when setting promisc on this card model. Please update (reloading pf_ring.ko), keep setting the promisc in your pre-start script, and let us know. Thank you.

JeremyAshton-shopify commented 8 years ago

This does not appear to have any effect.

Apr 18 19:22:19 <redacted> kernel: [  597.821564] [PF_RING-ZC] i40e_up_complete: attach <isp4> [pf start=3279945728 len=8388608][cache_line_size=64][MSIX enabled]
Apr 18 19:22:30 <redacted> kernel: [  608.897774] i40e 0000:04:00.1: ARQ: Update LLDP MIB event received
Apr 18 19:22:30 <redacted> kernel: [  608.897780] i40e 0000:04:00.1: LLDP event mib bridge type 0x0
Apr 18 19:22:30 <redacted> kernel: [  608.897782] i40e 0000:04:00.1: LLDP event mib type remote
Apr 18 19:22:30 <redacted> kernel: [  608.941081] i40e 0000:04:00.1: ARQ: Update LLDP MIB event received
Apr 18 19:22:30 <redacted> kernel: [  608.941085] i40e 0000:04:00.1: LLDP event mib bridge type 0x0
Apr 18 19:22:30 <redacted> kernel: [  608.941086] i40e 0000:04:00.1: LLDP event mib type local
Apr 18 19:22:30 <redacted> kernel: [  608.946128] i40e 0000:04:00.1: ETS UP2TC changed.
Apr 18 19:22:30 <redacted> kernel: [  608.946132] i40e 0000:04:00.1: ETS TC BW Table changed.
Apr 18 19:22:30 <redacted> kernel: [  608.946136] i40e 0000:04:00.1: ETS TSA Table changed.
Apr 18 19:22:30 <redacted> kernel: [  608.946137] i40e 0000:04:00.1: PFC config change detected.
Apr 18 19:22:30 <redacted> kernel: [  608.946139] i40e 0000:04:00.1: dcb need_reconfig=1
Apr 18 19:22:30 <redacted> kernel: [  608.948503] [PF_RING-ZC] i40e_down: detach <isp4>
Apr 18 19:22:30 <redacted> kernel: [  608.948504] [PF_RING-ZC] i40e_down: detaching <isp4> while in use
Apr 18 19:22:30 <redacted> kernel: [  608.950680] i40e 0000:04:00.1: VEB bw config failed, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_EACCES
Apr 18 19:22:30 <redacted> kernel: [  608.950682] i40e 0000:04:00.1: Failed configuring TC for VEB seid=290
Apr 18 19:22:30 <redacted> kernel: [  608.951749] i40e 0000:04:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 18 19:22:30 <redacted> kernel: [  608.951752] i40e 0000:04:00.1: Failed configuring TC map 255 for VSI 526
Apr 18 19:22:30 <redacted> kernel: [  608.951754] i40e 0000:04:00.1: Failed configuring TC for VSI seid=526
Apr 18 19:22:30 <redacted> kernel: [  608.954025] i40e 0000:04:00.1: VSI seid 0 Tx ring 1 disable timeout
Apr 18 19:22:30 <redacted> kernel: [  608.954029] i40e 0000:04:00.1: ARQ: Update LLDP MIB event received
Apr 18 19:22:30 <redacted> kernel: [  608.954094] i40e 0000:04:00.1: LLDP event mib bridge type 0x4
Apr 18 19:22:30 <redacted> kernel: [  608.954098] i40e 0000:04:00.1: ARQ: Update LLDP MIB event received
Apr 18 19:22:30 <redacted> kernel: [  608.954100] i40e 0000:04:00.1: LLDP event mib bridge type 0x0
Apr 18 19:22:30 <redacted> kernel: [  608.954101] i40e 0000:04:00.1: LLDP event mib type local
Apr 18 19:22:30 <redacted> kernel: [  608.960153] i40e 0000:04:00.1: No change detected in DCBX configuration.
Apr 18 19:22:31 <redacted> kernel: [  609.641835] i40e 0000:04:00.1: PFR requested
Apr 18 19:22:31 <redacted> kernel: [  609.641840] i40e 0000:04:00.1: Tearing down internal switch for reset
Apr 18 19:22:31 <redacted> kernel: [  609.651602] i40e 0000:04:00.1: Rebuilding internal switch
Apr 18 19:22:31 <redacted> kernel: [  609.684028] i40e 0000:04:00.1: DCBX offload is supported for this PF.
Apr 18 19:22:31 <redacted> kernel: [  609.684031] i40e 0000:04:00.1: FCoE capability is disabled
Apr 18 19:22:31 <redacted> kernel: [  609.802483] i40e 0000:04:00.1: attempting to rebuild PF VSI
Apr 18 19:22:31 <redacted> kernel: [  609.806590] i40e 0000:04:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 18 19:22:31 <redacted> kernel: [  609.806592] i40e 0000:04:00.1: Failed configuring TC map 255 for VSI 526
Apr 18 19:22:31 <redacted> kernel: [  609.806596] i40e 0000:04:00.1: failed to configure TCs for main VSI tc_map 0x000000ff, err I40E_ERR_INVALID_QP_ID aq_err I40E_AQ_RC_EINVAL
cardigliano commented 8 years ago

Questions:

  1. do you see the same errors using the Intel driver?
  2. were you able to reproduce the "soft lockup" trace since my last fix?
JeremyAshton-shopify commented 8 years ago

With the stock intel drivers I seem to keep receiving this message:

Apr 19 16:53:47 <redacted> kernel: [  759.321988] i40e 0000:02:00.1: veb bw config failed, aq_err=14
Apr 19 16:53:47 <redacted> kernel: [  759.321995] i40e 0000:02:00.1: Failed configuring TC for VEB seid=289
Apr 19 16:53:47 <redacted> kernel: [  759.324087] i40e 0000:02:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 19 16:53:47 <redacted> kernel: [  759.324093] i40e 0000:02:00.1: Failed configuring TC map 255 for VSI 526
Apr 19 16:53:47 <redacted> kernel: [  759.324098] i40e 0000:02:00.1: Failed configuring TC for VSI seid=526
Apr 19 16:53:47 <redacted> kernel: [  759.326154] i40e 0000:02:00.1: AQ command Config VSI BW allocation per TC failed = 14
Apr 19 16:53:47 <redacted> kernel: [  759.326159] i40e 0000:02:00.1: Failed configuring TC map 1 for VSI 0
Apr 19 16:53:47 <redacted> kernel: [  759.326163] i40e 0000:02:00.1: Failed configuring TC for VSI seid=0
Apr 19 16:53:47 <redacted> kernel: [  759.328397] i40e 0000:02:00.1: i40e_vsi_wait_txq_disabled: VSI seid 0 Tx ring 64 disable timeout

I think it is time for me to abandon the i40e based chipset. There seems to be too much broken upstream.

jbrandeb commented 8 years ago

still looks like DCB/LLDP is enabled to me.

JeremyAshton-shopify commented 8 years ago

It is disabled on the switch. And, lldpd has been removed from the node.

JeremyAshton-shopify commented 8 years ago

Once every hour I see a LLDP broadcast but have no idea where it is coming from. The message originates from the node in question... but no idea which process.

JeremyAshton-shopify commented 8 years ago

It turns out there is an embedded lldp daemon (agent) inside the NIC firmware. It apparently can be disabled via echo "lldp stop" > /sys/kernel/debug/i40e/<bus_address>/command.

cardigliano commented 8 years ago

Please note we upgraded, not sure it works better for you, worth a try.

cardigliano commented 8 years ago

Closing this issue as the attach/detach issue should be solved.