ntop / PF_RING

High-speed packet processing framework
http://www.ntop.org
GNU Lesser General Public License v2.1

Bad page state in process swapper/0 pfn:325392d #707

Closed ecloudzbox closed 2 years ago

ecloudzbox commented 3 years ago

CentOS 8.0, kernel 5.4.86-1.el8.elrepo.x86_64, PF_RING 7.9.0 (installed with make install)

When pktgen sends packets from the other end, the system dies after about an hour.

error log:

[Tue Apr 20 18:05:30 2021] BUG: Bad page state in process swapper/0 pfn:325392c
[Tue Apr 20 18:05:30 2021] page:ffffeb49c94e4b00 refcount:65533 mapcount:0 mapping:0000000000000000 index:0x0
[Tue Apr 20 18:05:30 2021] flags: 0x17ffffc0000000()
[Tue Apr 20 18:05:30 2021] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[Tue Apr 20 18:05:30 2021] raw: 0000000000000000 0000000000000000 0000fffdffffffff 0000000000000000
[Tue Apr 20 18:05:30 2021] page dumped because: nonzero _refcount
[Tue Apr 20 18:05:30 2021] Modules linked in: veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink br_netfilter bridge stp llc overlay fuse ffs(O) i40e(O) pf_ring(O) vxlan ip6_udp_tunnel udp_tunnel rfkill ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common sb_edac ipmi_ssif iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate mei_me intel_uncore pcspkr i2c_i801 mei joydev lpc_ich sg ioatdma ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables xfs libcrc32c hid_logitech_hidpp uas usb_storage hid_logitech_dj sr_mod cdrom sd_mod ast drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm igb ahci libahci libata dca crc32c_intel i2c_algo_bit wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: pf_ring]
[Tue Apr 20 18:05:30 2021] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B O 5.4.86-1.el8.elrepo.x86_64 #1
[Tue Apr 20 18:05:30 2021] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.1 06/06/2018
[Tue Apr 20 18:05:30 2021] Call Trace:
[Tue Apr 20 18:05:30 2021]
[Tue Apr 20 18:05:30 2021] dump_stack+0x66/0x90
[Tue Apr 20 18:05:30 2021] bad_page.cold.124+0x7f/0xb2
[Tue Apr 20 18:05:30 2021] free_pcppages_bulk+0x182/0x670
[Tue Apr 20 18:05:30 2021] free_unref_page+0x54/0x70
[Tue Apr 20 18:05:30 2021] ? udp4_lib_rcv+0x8ad/0xae0
[Tue Apr 20 18:05:30 2021] kfree_skb+0x32/0xa0
[Tue Apr 20 18:05:30 2021] __udp4_lib_rcv+0x8ad/0xae0
[Tue Apr 20 18:05:30 2021] ip_protocol_deliver_rcu+0xbe/0x1a0
[Tue Apr 20 18:05:30 2021] ip_local_deliver_finish+0x44/0x50
[Tue Apr 20 18:05:30 2021] ip_local_deliver+0xe0/0xf0
[Tue Apr 20 18:05:30 2021] ? ip_protocol_deliver_rcu+0x1a0/0x1a0
[Tue Apr 20 18:05:30 2021] ip_sublist_rcv_finish+0x3d/0x50
[Tue Apr 20 18:05:30 2021] ip_sublist_rcv+0x1d1/0x280
[Tue Apr 20 18:05:30 2021] ? ip_rcv_finish_core.isra.22+0x380/0x380
[Tue Apr 20 18:05:30 2021] ip_list_rcv+0x105/0x129
[Tue Apr 20 18:05:30 2021] netif_receive_skb_list_core+0x256/0x280
[Tue Apr 20 18:05:30 2021] netif_receive_skb_list_internal+0x192/0x2a0
[Tue Apr 20 18:05:30 2021] gro_normal_list.part.143+0x19/0x40
[Tue Apr 20 18:05:30 2021] napi_complete_done+0x83/0x110
[Tue Apr 20 18:05:30 2021] i40e_napi_poll+0x5cc/0x8b0 [i40e]
[Tue Apr 20 18:05:30 2021] ? net_rx_action+0x13b/0x380
[Tue Apr 20 18:05:30 2021] ? __do_softirq+0xe4/0x2f8
[Tue Apr 20 18:05:30 2021] ? irq_exit+0xe9/0xf0
[Tue Apr 20 18:05:30 2021] ? do_IRQ+0x53/0xe0
[Tue Apr 20 18:05:30 2021] ? common_interrupt+0xf/0xf
[Tue Apr 20 18:05:30 2021]
[Tue Apr 20 18:05:30 2021] ? cpuidle_enter_state+0xbc/0x450
[Tue Apr 20 18:05:30 2021] ? cpuidle_enter+0x29/0x40
[Tue Apr 20 18:05:30 2021] ? do_idle+0x228/0x270
[Tue Apr 20 18:05:30 2021] ? cpu_startup_entry+0x19/0x20
[Tue Apr 20 18:05:30 2021] ? start_kernel+0x534/0x555
[Tue Apr 20 18:05:30 2021] ? secondary_startup_64+0xb6/0xc0
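
For readers checking the dump by hand: the second `raw:` line contains the word `0000fffdffffffff`, which appears to carry the same counters that the `page:` line prints. The sketch below decodes that word, assuming a little-endian x86_64 `struct page` layout in which the upper 32 bits are `_refcount` and the lower 32 bits are the stored `_mapcount` (kept as the real map count minus one). The function name `decode_refcount_word` is hypothetical and exists only for this illustration.

```bash
# Decode the struct page word that holds _mapcount and _refcount.
# Assumption: little-endian x86_64 layout, upper 32 bits = _refcount,
# lower 32 bits = stored _mapcount (real map count minus one).
decode_refcount_word() {
    local word="$1" hi lo refcount mapcount
    hi=$(printf '%s' "$word" | cut -c1-8)
    lo=$(printf '%s' "$word" | cut -c9-16)
    refcount=$(( 0x$hi ))
    mapcount=$(( 0x$lo ))
    # Reinterpret the lower half as a signed 32-bit value.
    if [ "$mapcount" -ge 2147483648 ]; then
        mapcount=$(( mapcount - 4294967296 ))
    fi
    # Add one to the stored value to get the map count shown in the dump.
    echo "refcount=$refcount mapcount=$(( mapcount + 1 ))"
}

decode_refcount_word 0000fffdffffffff   # prints: refcount=65533 mapcount=0
```

The decoded values match `refcount:65533 mapcount:0` in the log, which is consistent with the reported reason "nonzero _refcount": the page reached the free path while references were still held on it.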

cardigliano commented 3 years ago

@ecloudzbox could you provide more information about:

  1. pf_ring kernel module parameters
  2. loaded drivers (and parameters if you are using ZC drivers)
  3. application (and configuration) you are running

Thank you

ecloudzbox commented 3 years ago

@cardigliano

  1. install module command:

insmod ./pf_ring.ko min_num_slots=65536 enable_tx_capture=0

module info:

[root@localhost kernel]# modinfo pf_ring
filename:       /lib/modules/5.4.86-1.el8.elrepo.x86_64/extra/pf_ring.ko.xz
alias:          net-pf-27
version:        7.9.0
description:    Packet capture acceleration and analysis
author:         ntop.org
license:        GPL
srcversion:     B7582AF3D744F71CA021D47
depends:
retpoline:      Y
name:           pf_ring
vermagic:       5.4.86-1.el8.elrepo.x86_64 SMP mod_unload modversions
parm:           min_num_slots:Min number of ring slots (uint)
parm:           perfect_rules_hash_size:Perfect rules hash size (uint)
parm:           enable_tx_capture:Set to 1 to capture outgoing packets (uint)
parm:           enable_frag_coherence:Set to 1 to handle fragments (flow coherence) in clusters (uint)
parm:           enable_ip_defrag:Set to 1 to enable IP defragmentation(only rx traffic is defragmentead) (uint)
parm:           quick_mode:Set to 1 to run at full speed but with upto one socket per interface (uint)
parm:           force_ring_lock:Set to 1 to force ring locking (automatically enable with rss) (uint)
parm:           enable_debug:Set to 1 to enable PF_RING debug tracing into the syslog, 2 for more verbosity (uint)
parm:           transparent_mode:(deprecated) (uint)

  2. loaded zc module

[root@localhost src]# pwd
/root/PF_RING/drivers/intel/i40e/i40e-2.13.10-zc/src

[root@localhost src]# ./load_driver.sh
Configuring ens2
IFACE CORE MASK -> FILE
ens2 0 1 -> /proc/irq/50/smp_affinity
Configuring ens7f2
IFACE CORE MASK -> FILE
ens7f2 0 1 -> /proc/irq/195/smp_affinity
Configuring ens7f0
IFACE CORE MASK -> FILE
ens7f0 0 1 -> /proc/irq/103/smp_affinity
Configuring ens7f3
IFACE CORE MASK -> FILE
ens7f3 0 1 -> /proc/irq/241/smp_affinity
Configuring ens7f1
IFACE CORE MASK -> FILE
ens7f1 0 1 -> /proc/irq/149/smp_affinity
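
In the output above, every interface is pinned to core 0 with mask 1. As an illustration of how the CORE and MASK columns relate, the sketch below converts a core index into the hexadecimal bitmask that would be written to an `smp_affinity` file. The helper name `core_to_affinity_mask` is hypothetical and not part of `load_driver.sh`; it handles only cores 0 to 31, that is, a single 32-bit mask word.

```bash
# Convert a CPU core index into the hex mask used by /proc/irq/N/smp_affinity.
# Sketch only: handles cores 0-31 (one 32-bit mask word).
core_to_affinity_mask() {
    local core="$1"
    if [ "$core" -lt 0 ] || [ "$core" -gt 31 ]; then
        echo "core out of range: $core" >&2
        return 1
    fi
    # Bit N of the mask selects core N.
    printf '%x\n' $(( 1 << core ))
}

core_to_affinity_mask 0   # prints: 1
core_to_affinity_mask 4   # prints: 10
```

Core 0 maps to mask 1, which matches the `0 1` pairs printed for each interface.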

i40e module info:

[root@localhost ~]# modinfo i40e
filename:       /lib/modules/5.4.86-1.el8.elrepo.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz
version:        2.8.20-k
license:        GPL v2
description:    Intel(R) Ethernet Connection XL710 Network Driver
author:         Intel Corporation, e1000-devel@lists.sourceforge.net
srcversion:     B9A1D27F86384157A250744
alias:          pci:v00008086d0000158Bsvsdbcsci
alias:          pci:v00008086d0000158Asvsdbcsci
alias:          pci:v00008086d00000D58svsdbcsci
alias:          pci:v00008086d00000CF8svsdbcsci
alias:          pci:v00008086d00001588svsdbcsci
alias:          pci:v00008086d00001587svsdbcsci
alias:          pci:v00008086d000037D3svsdbcsci
alias:          pci:v00008086d000037D2svsdbcsci
alias:          pci:v00008086d000037D1svsdbcsci
alias:          pci:v00008086d000037D0svsdbcsci
alias:          pci:v00008086d000037CFsvsdbcsci
alias:          pci:v00008086d000037CEsvsdbcsci
alias:          pci:v00008086d0000104Fsvsdbcsci
alias:          pci:v00008086d0000104Esvsdbcsci
alias:          pci:v00008086d000015FFsvsdbcsci
alias:          pci:v00008086d00001589svsdbcsci
alias:          pci:v00008086d00001586svsdbcsci
alias:          pci:v00008086d00001585svsdbcsci
alias:          pci:v00008086d00001584svsdbcsci
alias:          pci:v00008086d00001583svsdbcsci
alias:          pci:v00008086d00001581svsdbcsci
alias:          pci:v00008086d00001580svsdbcsci
alias:          pci:v00008086d00001574svsdbcsci
alias:          pci:v00008086d00001572svsdbcsci
depends:
retpoline:      Y
intree:         Y
name:           i40e
vermagic:       5.4.86-1.el8.elrepo.x86_64 SMP mod_unload modversions
parm:           debug:Debug level (0=none,...,16=all), Debug mask (0x8XXXXXXX) (uint)
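
One detail worth checking here: the `modinfo i40e` output above reports `intree: Y` and a file under `kernel/drivers`, while the crash log lists `i40e(O)`, the flag for an out-of-tree module. `modinfo` with a bare module name describes the file it finds under `/lib/modules`, which may not be the module that is actually loaded if the ZC driver was inserted from the source directory. The sketch below is a small text helper for comparing the two; the name `modinfo_field` is hypothetical, and the commented commands assume the loaded driver exposes `/sys/module/i40e/version`.

```bash
# Print the value of one field from modinfo-style "key: value" text on stdin.
# Only the first line whose key matches exactly is printed.
modinfo_field() {
    local key="$1"
    awk -v key="$key" '
        index($0, key ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "")   # drop the key and the padding after it
            print
            exit
        }'
}

# Possible use on the affected host (not run here):
#   modinfo i40e | modinfo_field version     # version of the on-disk module
#   modinfo ./i40e.ko | modinfo_field version # version of the ZC build in the source tree
#   cat /sys/module/i40e/version             # version of the module currently loaded
```

If the first and last values differ, the `modinfo` output by name describes a different driver from the one that produced the trace.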

  3. Use the pktgen tool to send traffic to the PF_RING ZC node. The bug appears after about 20 minutes, and the system goes down or becomes unavailable after about an hour. The script is:

modprobe pktgen

# Write one command to the current pktgen control file and report failures.
function pgset() {
    local result
    echo "$1" > "$PGDEV"
    result=$(cat "$PGDEV" | fgrep "Result: OK:")
    if [ "$result" = "" ]; then
        cat "$PGDEV" | fgrep Result:
    fi
}

# Attach the sending device to the pktgen kernel thread for CPU 0.
PGDEV=/proc/net/pktgen/kpktgend_0
pgset "rem_device_all"
pgset "add_device p3p4"

# Per-device settings: send continuously, no delay, 1500-byte packets.
PGDEV=/proc/net/pktgen/p3p4
pgset "count 0"
pgset "delay 0"
pgset "clone_skb 0"
pgset "pkt_size 1500"
pgset "dst 10.0.0.171"
# pgset "dst_mac 68:91:d0:66:2a:ea"
pgset "src_mac 68:91:d0:66:80:69"

# Start transmission.
PGDEV=/proc/net/pktgen/pgctrl
pgset "start"

Thank you !!!

sippejw commented 2 years ago

Is there any update on this? I am seeing the same thing on Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-90-generic x86_64). Building 8.0.0-stable from source and using zbalance_ipc is producing the same error as above.

cardigliano commented 2 years ago

@sippejw since we are not able to reproduce (and thus debug) this, could you provide detailed instructions to reproduce it or even better access to the system? Thank you.

jmwample commented 2 years ago

To follow up here: we have experienced this issue on Ubuntu 18.04 and 20.04, specifically with network cards that use the Intel XL710 QSFP+ network driver. It seems to be caused by the i40e driver and happens only with 7.8.0-stable and later (including the dev branch as of 5cc19525b97142e0147fdb930b59f84770fb4e51) -- drivers/intel/i40e/i40e-2.4.6-zc doesn't seem to have this issue.

Hardware

$ lspci | egrep -i --color 'network|ethernet'
1b:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
1b:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
b3:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
b3:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)

Loaded drivers and parameters

Since we install from source, we install only the pf_ring kernel module and the i40e ZC driver module.

$ insmod pf_ring.ko min_num_slots=65536
$ modinfo pf_ring
filename:       /lib/modules/5.4.0-91-generic/kernel/net/pf_ring/pf_ring.ko
alias:          net-pf-27
version:        8.1.0
description:    Packet capture acceleration and analysis
author:         ntop.org
license:        GPL
srcversion:     49928A30A20087E50DEE717
depends:        
retpoline:      Y
name:           pf_ring
vermagic:       5.4.0-91-generic SMP mod_unload modversions 
parm:           min_num_slots:Min number of ring slots (uint)
parm:           perfect_rules_hash_size:Perfect rules hash size (uint)
parm:           enable_tx_capture:Set to 1 to capture outgoing packets (uint)
parm:           enable_frag_coherence:Set to 1 to handle fragments (flow coherence) in clusters (uint)
parm:           enable_ip_defrag:Set to 1 to enable IP defragmentation(only rx traffic is defragmentead) (uint)
parm:           quick_mode:Set to 1 to run at full speed but with upto one socket per interface (uint)
parm:           force_ring_lock:Set to 1 to force ring locking (automatically enable with rss) (uint)
parm:           enable_debug:Set to 1 to enable PF_RING debug tracing into the syslog, 2 for more verbosity (uint)
parm:           transparent_mode:(deprecated) (uint)

Application

The error occurs when a device attempts to use the i40e driver after it is inserted.

# watch dmesg for BUGs
sudo dmesg -wH

## Separate terminal
# insert module
cd PF_RING/drivers/intel/i40e/i40e-2.4.6-zc/src/
sudo ./load_driver.sh

sudo zcount -i zc:ens2f0

dmesg should show a cascade of page errors. Here it blames sshd because our management interfaces also use i40e, so they start using the new driver as soon as it is inserted.

[  +0.000001] BUG: Bad page state in process sshd  pfn:bf5668
[  +0.000314] page:ffffb8cfafd59a00 refcount:65533 mapcount:0 mapping:0000000000000000 index:0x0
[  +0.000000] flags: 0x17ffffc0000000()
[  +0.000001] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[  +0.000001] raw: 0000000000000000 0000000000000000 0000fffdffffffff 0000000000000000
[  +0.000000] page dumped because: nonzero _refcount
[  +0.000000] Modules linked in: i40e(OE) pf_ring(OE) ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid vxlan ip6_udp_tunnel udp_tunnel bonding dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif intel_rapl_msr intel_rapl_common isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rapl intel_cstate mei_me mei joydev input_leds ioatdma dca ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid sch_fq_codel msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper crct10dif_pclmul syscopyarea sysfillrect crc32_pclmul sysimgblt ghash_clmulni_intel hid_generic fb_sys_fops aesni_intel crypto_simd usbhid cryptd drm hid glue_helper i2c_i801 lpc_ich ahci libahci wmi [last unloaded: pf_ring]
[  +0.000041] CPU: 6 PID: 31998 Comm: sshd Tainted: G    B      OE     5.4.0-90-generic #101-Ubuntu
[  +0.000001] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 2.2 11/01/2018
[  +0.000000] Call Trace:
[  +0.000009]  dump_stack+0x6d/0x8b
[  +0.000001]  bad_page.cold+0x80/0xb1
[  +0.000001]  check_new_page_bad+0x67/0x80
[  +0.000002]  rmqueue+0x72e/0xf00
[  +0.000007]  get_page_from_freelist+0xb8/0x3f0
[  +0.000001]  __alloc_pages_nodemask+0x173/0x320
[  +0.000001]  alloc_pages_current+0x87/0xe0
[  +0.000002]  skb_page_frag_refill+0x80/0x110
[  +0.000007]  sk_page_frag_refill+0x21/0x80
[  +0.000001]  tcp_sendmsg_locked+0x2c9/0xde0
[  +0.000002]  tcp_sendmsg+0x2d/0x50
[  +0.000007]  inet_sendmsg+0x43/0x70
[  +0.000001]  sock_sendmsg+0x5e/0x70
[  +0.000002]  sock_write_iter+0x93/0xf0
[  +0.000008]  new_sync_write+0x125/0x1c0
[  +0.000001]  __vfs_write+0x29/0x40
[  +0.000002]  vfs_write+0xb9/0x1a0
[  +0.000000]  ksys_write+0x67/0xe0
[  +0.000008]  __x64_sys_write+0x1a/0x20
[  +0.000001]  do_syscall_64+0x57/0x190
[  +0.000001]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  +0.000000] RIP: 0033:0x7f1f4332a1e7
[  +0.000001] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  +0.000001] RSP: 002b:00007ffc78e24a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  +0.000007] RAX: ffffffffffffffda RBX: 00000000000002f4 RCX: 00007f1f4332a1e7
[  +0.000000] RDX: 00000000000002f4 RSI: 00005575f1da8ab0 RDI: 0000000000000004
[  +0.000001] RBP: 00005575f1daf9c0 R08: 00007ffc78fc70f0 R09: 00007ffc78e249b8
[  +0.000000] R10: 00007ffc78e249b0 R11: 0000000000000246 R12: 0000000000000000
[  +0.000001] R13: 00005575f064a868 R14: 0000000000000004 R15: 0000000000000004
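
Because the errors arrive as a cascade and are attributed to whichever process happens to touch the bad page, a per-process summary can be easier to read than the raw log. The sketch below extracts the process name and page frame number from each "Bad page state" line on standard input. The function name `bad_page_summary` is hypothetical, and the pattern assumes the message format shown in the two logs in this thread.

```bash
# Print "<process> <pfn>" for every "Bad page state" line read from stdin.
# Assumes the message format seen in the logs in this thread.
bad_page_summary() {
    sed -n 's/.*BUG: Bad page state in process \(.*[^ ]\) *pfn:\([0-9a-f]*\).*/\1 \2/p'
}

# Possible use on the affected host (not run here):
#   sudo dmesg | bad_page_summary | sort | uniq -c
```

Fed the first line of the trace above, the function prints `sshd bf5668`. Counting distinct process names in this way may help confirm that the reports are spread across unrelated processes rather than tied to one application.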

cardigliano commented 2 years ago

This may be related to https://github.com/ntop/PF_RING/issues/774

jmwample commented 2 years ago

The solution to #774 has fixed this issue for us, and we are no longer seeing page errors.

cardigliano commented 2 years ago

Thank you for the update. Let's close this.