Open mohammadkhanjani opened 2 years ago
The dmesg call trace is incomplete and does not show what the problem is (e.g. NULL pointer dereference, deadlock, etc.).
Hi, thanks for looking into my problem. Please help me; this is an emergency and a very serious bug. I tested the older netmap with the ixgbe driver 5.5.3 and it was OK. After that I tested again with the newest netmap with ixgbe 5.5.3 and 5.15.2, and the problem still exists.
Complete dmesg:
[500932.091311] watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [C eth4 #0:10872]
[500932.091356] Modules linked in: ixgbe(OE) i40e(OE) auxiliary(OE) netmap(OE) dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 veth iptable_filter ip_tables x_tables bridge stp llc ipmi_ssif intel_rapl sb_edac nls_iso8859_1 x86_pkg_temp_thermal intel_powerclamp hpilo coretemp kvm irqbypass intel_cstate intel_rapl_perf shpchp ioatdma lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter mac_hid nfsd auth_rpcgss nfs_acl lockd grace sunrpc autofs4 algif_skcipher af_alg dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd mgag200 i2c_algo_bit ttm drm_kms_helper
[500932.091401] syscopyarea sysfillrect sysimgblt fb_sys_fops ahci drm libahci tg3 dca hpsa ptp pps_core scsi_transport_sas wmi [last unloaded: auxiliary]
[500932.091411] CPU: 4 PID: 10872 Comm: C eth4 #0 Tainted: G OE 4.15.25-47 #1
[500932.091425] RIP: 0010:ixgbe_netmap_rxsync+0x182/0x440 [ixgbe]
[500932.091426] RSP: 0018:ffffb013cdf9f988 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
[500932.091427] RAX: 00000000aa5c7e91 RBX: 0000000000000e91 RCX: ffff9d9cffb6ea10
[500932.091428] RDX: 0000000000000564 RSI: 0000000000000000 RDI: 0000002819226c00
[500932.091429] RBP: ffffb013cdf9f9e8 R08: 0000000000000fff R09: 0000000000000e91
[500932.091429] R10: ffffb013ebe8d000 R11: 0000000000000000 R12: 0000000000000002
[500932.091430] R13: ffff9d847e0a8a50 R14: 0000000000000e91 R15: ffff9d9cffb60000
[500932.091431] FS: 00007fff6d810700(0000) GS:ffff9d9ebf900000(0000) knlGS:0000000000000000
[500932.091432] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[500932.091433] CR2: 00007ffff712c760 CR3: 0000003f6c18d003 CR4: 00000000003606e0
[500932.091434] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[500932.091434] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[500932.091435] Call Trace:
[500932.091448] netmap_poll+0x562/0x5d0 [netmap]
[500932.091452] ? _slab_alloc+0x27d/0x4e0
[500932.091457] linux_netmap_poll+0x3f/0x60 [netmap]
[500932.091460] do_sys_poll+0x364/0x600
[500932.091464] ? notify_change+0x390/0x410
[500932.091468] ? _cond_resched+0x1a/0x50
[500932.091470] ? poll_initwait+0x50/0x50
[500932.091471] ? compat_poll_select_copy_remaining+0x140/0x140
[500932.091473] ? compat_poll_select_copy_remaining+0x140/0x140
[500932.091476] ? seccomp_run_filters+0x57/0xc0
[500932.091479] ? proc_destroy_inode+0x1c/0x20
[500932.091480] ? destroy_inode+0x3b/0x60
[500932.091481] ? evict+0x136/0x1a0
[500932.091483] ? seccomp_filter+0x49/0x540
[500932.091484] ? dentry_free+0x38/0x70
[500932.091486] ? dentry_kill+0x118/0x170
[500932.091487] ? _cond_resched+0x1a/0x50
[500932.091488] ? dput+0x34/0x1f0
[500932.091490] ? mntput+0x24/0x40
[500932.091492] ? fput+0x190/0x220
[500932.091494] ? ktime_get_ts64+0x49/0xf0
[500932.091496] SyS_poll+0x71/0x130
[500932.091497] ? SyS_poll+0x71/0x130
[500932.091500] do_syscall_64+0x6d/0x120
[500932.091501] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[500932.091503] RIP: 0033:0x7ffff24d684d
[500932.091503] RSP: 002b:00007fff6d39a020 EFLAGS: 00000293 ORIG_RAX: 0000000000000007
[500932.091504] RAX: ffffffffffffffda RBX: 0000000008083de0 RCX: 00007ffff24d684d
[500932.091505] RDX: 0000000000000064 RSI: 0000000000000001 RDI: 0000000008083e30
[500932.091505] RBP: 0000000008083e30 R08: 0000000020000000 R09: 0000000008501180
[500932.091506] R10: 00000000000003df R11: 0000000000000293 R12: 00007fff6d39a0a0
[500932.091507] R13: 00007fffffffe4ff R14: 00007fff6d8109c0 R15: 0000000000000000
[500932.091507] Code: 01 00 00 49 8b 7d 70 8b 31 3b b7 30 01 00 00 4c 8b 97 28 01 00 00 0f 83 07 02 00 00 49 8b 3c f2 48 8b 71 08 49 23 b5 58 01 00 00 <49> 8b 8d 60 01 00 00 48 39 ce 48 0f 46 ce 48 89 ce 48 01 fe 0f
When it gets stuck, the server has to be rebooted.
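For anyone triaging a log like the one above, here is a minimal sketch of how the key facts can be pulled out of the dmesg text. It is not part of netmap or ixgbe; the function name `summarize` and the two regular expressions are hypothetical helpers written only for illustration. It relies on two conventions of Linux call traces: `symbol+0xOFFSET/0xSIZE` gives the offset into a function and that function's total size, and a leading `?` marks a stack entry the unwinder could not verify.

```python
import re

# Hypothetical helper patterns, not part of any netmap or kernel tooling.
# Matches references such as "netmap_poll+0x562/0x5d0 [netmap]".
FRAME_RE = re.compile(
    r"(?P<unreliable>\? )?(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)
# Matches the watchdog banner, e.g. "soft lockup - CPU#4 stuck for 22s!".
LOCKUP_RE = re.compile(r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!")


def summarize(dmesg_text):
    """Return the lockup banner, the kernel RIP, and the verified frames."""
    summary = {"lockup": None, "rip": None, "frames": []}
    for line in dmesg_text.splitlines():
        banner = LOCKUP_RE.search(line)
        if banner:
            summary["lockup"] = (int(banner["cpu"]), int(banner["secs"]))
            continue
        match = FRAME_RE.search(line)
        if not match:
            continue  # registers, code bytes, user-space RIP, etc.
        frame = {
            "symbol": match["symbol"],
            "offset": int(match["offset"], 16),
            "size": int(match["size"], 16),
            "module": match["module"],
        }
        if "RIP:" in line:
            # Keep only the first RIP that resolves to a kernel symbol.
            if summary["rip"] is None:
                summary["rip"] = frame
        elif not match["unreliable"]:
            # Skip "?" entries: addresses found on the stack but unverified.
            summary["frames"].append(frame)
    return summary
```

Feeding it the trace above would report the CPU spinning inside `ixgbe_netmap_rxsync`, reached through `netmap_poll` and the `poll` system call. That shows where the CPU was stuck, not why.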
Hi, I am using ixgbe 5.15.2 as the netmap external driver. When I run:
numactl -C 0-3 pkt-gen -f rx -c 4 -p 4 -i eth8
(eth8 is an ixgbe interface), dmesg shows the following message:
Call Trace: netmap_poll+0x562/0x5d0 [netmap] ? ktime_get_ts64+0x49/0xf0 linux_netmap_poll+0x3f/0x60 [netmap] do_sys_poll . . .
Why? Can you help me? Sometimes it gets stuck and cannot be closed.
When I run this command in a script:
After some iterations the script gets stuck, and I have to reboot the server.
I ran this script with a previous version of netmap about 1 or 2 years ago, and there was no problem working with ixgbe. Please help me fix this problem.