openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes, provisioned from an optimized NVMe/SPDK backend data storage stack.
Apache License 2.0

Kernel panic on 6.1.14 in nvme_tcp #1330

Open Szpadel opened 1 year ago

Szpadel commented 1 year ago

Describe the bug When the io-engine dies for whatever reason, nothing is logged before the crash. All app nodes using the mounted volume crash (not only the one where the io-engine was running).

To Reproduce I do not know the exact trigger, but I was able to reproduce it by deleting the pod with the io-engine.
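
i.e. something like the following (the namespace and label selector here assume a default Mayastor install and may differ):

```sh
# Delete the io-engine pod to trigger the crash on the app nodes.
# Namespace and label selector are assumptions based on a typical
# Mayastor deployment; adjust to match yours.
kubectl -n mayastor delete pod -l app=io-engine
```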

Expected behavior This could be a kernel bug, but I don't know the implementation details of mayastor well enough to report it there.

Screenshots Reproduced by killing the pod:

[ 2437.197218] nvme nvme4: starting error recovery
[ 2437.197835] nvme nvme4: Reconnecting in 10 seconds...
[ 2437.203659] nvme nvme3: starting error recovery
[ 2437.204157] nvme nvme3: Reconnecting in 10 seconds...
[ 2437.207250] nvme nvme0: starting error recovery
[ 2437.207703] nvme nvme0: Reconnecting in 10 seconds...
[ 2437.210338] nvme nvme1: starting error recovery
[ 2437.210768] nvme nvme1: Reconnecting in 10 seconds...
[ 2437.213314] nvme nvme2: starting error recovery
[ 2437.214038] nvme nvme2: Reconnecting in 10 seconds...
[ 2438.055584] block nvme1n1: no usable path - requeuing I/O
[ 2438.781645] BUG: kernel NULL pointer dereference, address: 0000000000000038
[ 2438.781848] #PF: supervisor read access in kernel mode
[ 2438.781996] #PF: error_code(0x0000) - not-present page
[ 2438.782144] PGD 0 P4D 0
[ 2438.782273] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2438.782422] CPU: 4 PID: 9790 Comm: agent-ha-node Not tainted 6.1.14-200.fc37.x86_64 #1
[ 2438.782588] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 2438.782746] RIP: 0010:kernel_getsockname+0xb/0x20
[ 2438.782920] Code: 1f 44 00 00 48 8b 47 20 48 8b 40 20 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 48 8b 47 20 31 d2 <48> 8b 40 38 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[ 2438.783230] RSP: 0018:ffffa54fc8a3bcf0 EFLAGS: 00010246
[ 2438.783377] RAX: 0000000000000000 RBX: 000000000000001f RCX: 0000000000000001
[ 2438.783533] RDX: 0000000000000000 RSI: ffffa54fc8a3bcf8 RDI: ffff914dbd047b80
[ 2438.783695] RBP: 0000000000001000 R08: 0000000000000004 R09: ffff914b4bc3b01e
[ 2438.783842] R10: ffffffffffffffff R11: 0000000000000000 R12: ffffa54fc8a3bcf8
[ 2438.783990] R13: ffff914b4bc3b000 R14: ffff914d61ccb400 R15: 0000000000000001
[ 2438.784143] FS:  00007fb9e69ca540(0000) GS:ffff915a3f900000(0000) knlGS:0000000000000000
[ 2438.784292] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2438.784438] CR2: 0000000000000038 CR3: 0000000f5247a001 CR4: 0000000000170ee0
[ 2438.784592] Call Trace:
[ 2438.784721]  <TASK>
[ 2438.784840]  nvme_tcp_get_address+0x59/0xd0 [nvme_tcp]
[ 2438.785010]  nvme_sysfs_show_address+0x1b/0x30 [nvme_core]
[ 2438.785217]  dev_attr_show+0x15/0x40
[ 2438.785396]  sysfs_kf_seq_show+0xa0/0xe0
[ 2438.785561]  seq_read_iter+0x11f/0x450
[ 2438.785739]  vfs_read+0x217/0x2f0
[ 2438.785899]  ksys_read+0x5b/0xd0
[ 2438.786044]  do_syscall_64+0x58/0x80
[ 2438.786208]  ? do_syscall_64+0x67/0x80
[ 2438.786355]  ? do_syscall_64+0x67/0x80
[ 2438.786504]  ? do_syscall_64+0x67/0x80
[ 2438.786654]  ? do_syscall_64+0x67/0x80
[ 2438.786799]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 2438.786951] RIP: 0033:0x7fb9e6acd7c4
[ 2438.787169] Code: 84 00 00 00 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b 9e f8 ff 4c 89 e2 48 89 ee 89 df 41 89 c0 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 3c 44 89 c7 48 89 44 24 08 e8 67 9e f8 ff 48
[ 2438.787477] RSP: 002b:00007ffe1a90d910 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2438.787632] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007fb9e6acd7c4
[ 2438.787783] RDX: 0000000000001000 RSI: 000055bc105e8780 RDI: 000000000000000c
[ 2438.787938] RBP: 000055bc105e8780 R08: 0000000000000000 R09: 0000000000000000
[ 2438.788087] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001000
[ 2438.788238] R13: 0000000000000000 R14: 00007ffe1a90da50 R15: 000000000000000c
[ 2438.788388]  </TASK>
[ 2438.788507] Modules linked in: tls nfsd nfs_acl xt_multiport xt_set ipt_rpfilter ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel bpf_preload wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs veth nf_conntrack_netlink xt_addrtype xt_nat xt_statistic ipt_REJECT ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_MASQUERADE xt_mark xt_conntrack xt_comment nft_compat overlay nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink sunrpc vfat intel_rapl_msr fat intel_rapl_common kvm_intel xfs kvm irqbypass rapl i2c_piix4 joydev virtio_balloon tcp_bbr sch_fq nvme_tcp nvme_fabrics nvme_core nvme_common loop zram crct10dif_pclmul crc32_pclmul virtio_net crc32c_intel polyval_clmulni bochs polyval_generic drm_vram_helper ghash_clmulni_intel net_failover
[ 2438.788698]  drm_ttm_helper ttm sha512_ssse3 failover virtio_console virtio_scsi serio_raw ata_generic pata_acpi fuse qemu_fw_cfg
[ 2438.792182] CR2: 0000000000000038
[ 2438.792605] ---[ end trace 0000000000000000 ]---
[ 2438.793066] RIP: 0010:kernel_getsockname+0xb/0x20
[ 2438.793463] Code: 1f 44 00 00 48 8b 47 20 48 8b 40 20 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 48 8b 47 20 31 d2 <48> 8b 40 38 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[ 2438.794286] RSP: 0018:ffffa54fc8a3bcf0 EFLAGS: 00010246
[ 2438.794767] RAX: 0000000000000000 RBX: 000000000000001f RCX: 0000000000000001
[ 2438.795176] RDX: 0000000000000000 RSI: ffffa54fc8a3bcf8 RDI: ffff914dbd047b80
[ 2438.795616] RBP: 0000000000001000 R08: 0000000000000004 R09: ffff914b4bc3b01e
[ 2438.796023] R10: ffffffffffffffff R11: 0000000000000000 R12: ffffa54fc8a3bcf8
[ 2438.796401] R13: ffff914b4bc3b000 R14: ffff914d61ccb400 R15: 0000000000000001
[ 2438.796806] FS:  00007fb9e69ca540(0000) GS:ffff915a3f900000(0000) knlGS:0000000000000000
[ 2438.797233] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2438.797867] CR2: 0000000000000038 CR3: 0000000f5247a001 CR4: 0000000000170ee0
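
The trace is consistent with a sysfs read racing against error recovery: `agent-ha-node` reads a controller's `address` attribute (`nvme_sysfs_show_address` → `nvme_tcp_get_address` → `kernel_getsockname`) after nvme_tcp has released the queue's socket, so `sock->ops` is NULL and the fetch of `ops->getname` (offset 0x38, matching CR2) faults. A hypothetical reproducer sketch along those lines (the polling loop is a guess to widen the race window, not a confirmed trigger):

```sh
# Hypothetical: poll the sysfs 'address' attribute of every NVMe-oF
# controller while nvme_tcp error recovery is in progress (e.g. right
# after the io-engine pod is deleted), racing the read against the
# socket teardown seen in the call trace above.
while true; do
  for ctrl in /sys/class/nvme/nvme*; do
    cat "$ctrl/address" > /dev/null 2>&1
  done
done
```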

Also caught happening organically:

[  503.657524] nvme nvme0: queue 0: timeout request 0x0 type 4
[  503.657532] nvme nvme0: starting error recovery
[  503.657881] nvme nvme0: failed nvme_keep_alive_end_io error=10
[  503.664549] nvme nvme0: Reconnecting in 10 seconds...
[  504.425613] nvme nvme1: queue 0: timeout request 0x0 type 4
[  504.425621] nvme nvme1: starting error recovery
[  504.425935] nvme nvme1: failed nvme_keep_alive_end_io error=10
[  504.432515] nvme nvme1: Reconnecting in 10 seconds...
[  505.129536] nvme nvme2: queue 0: timeout request 0x0 type 4
[  505.129544] nvme nvme2: starting error recovery
[  505.129806] nvme nvme2: failed nvme_keep_alive_end_io error=10
[  505.136528] nvme nvme2: Reconnecting in 10 seconds...
[  506.455595] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  506.455687] #PF: supervisor read access in kernel mode
[  506.455741] #PF: error_code(0x0000) - not-present page
[  506.455793] PGD 0 P4D 0
[  506.455838] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  506.455890] CPU: 6 PID: 7291 Comm: agent-ha-node Not tainted 6.1.14-200.fc37.x86_64 #1
[  506.455949] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[  506.456005] RIP: 0010:kernel_getsockname+0xb/0x20
[  506.456072] Code: 1f 44 00 00 48 8b 47 20 48 8b 40 20 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 48 8b 47 20 31 d2 <48> 8b 40 38 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[  506.456182] RSP: 0018:ffffade047e37d00 EFLAGS: 00010246
[  506.456234] RAX: 0000000000000000 RBX: 000000000000001f RCX: 0000000000000001
[  506.456287] RDX: 0000000000000000 RSI: ffffade047e37d08 RDI: ffff8fb64c7afb80
[  506.456339] RBP: 0000000000001000 R08: 0000000000000004 R09: ffff8fb64390501e
[  506.456392] R10: ffffffffffffffff R11: 0000000000000000 R12: ffffade047e37d08
[  506.456445] R13: ffff8fb643905000 R14: ffff8fb686cbb000 R15: 0000000000000001
[  506.456502] FS:  00007f8080e99540(0000) GS:ffff8fc53f980000(0000) knlGS:0000000000000000
[  506.456561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  506.456613] CR2: 0000000000000038 CR3: 0000000190e48001 CR4: 0000000000170ee0
[  506.456669] Call Trace:
[  506.456710]  <TASK>
[  506.456752]  nvme_tcp_get_address+0x59/0xd0 [nvme_tcp]
[  506.456828]  nvme_sysfs_show_address+0x1b/0x30 [nvme_core]
[  506.456924]  dev_attr_show+0x15/0x40
[  506.456984]  sysfs_kf_seq_show+0xa0/0xe0
[  506.457046]  seq_read_iter+0x11f/0x450
[  506.457100]  vfs_read+0x217/0x2f0
[  506.457154]  ksys_read+0x5b/0xd0
[  506.457204]  do_syscall_64+0x58/0x80
[  506.457267]  ? do_syscall_64+0x67/0x80
[  506.457329]  ? do_syscall_64+0x67/0x80
[  506.457379]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  506.457436] RIP: 0033:0x7f8080f9c7c4
[  506.457572] Code: 84 00 00 00 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b 9e f8 ff 4c 89 e2 48 89 ee 89 df 41 89 c0 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 3c 44 89 c7 48 89 44 24 08 e8 67 9e f8 ff 48
[  506.457684] RSP: 002b:00007ffe7fade290 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  506.457740] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f8080f9c7c4
[  506.457792] RDX: 0000000000001000 RSI: 000055566374d190 RDI: 000000000000000b
[  506.457844] RBP: 000055566374d190 R08: 0000000000000000 R09: 0000000000000000
[  506.457897] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001000
[  506.457949] R13: 0000000000000000 R14: 00007ffe7fade3d0 R15: 000000000000000b
[  506.458003]  </TASK>
[  506.458045] Modules linked in: iptable_nat iptable_filter br_netfilter bridge stp llc ip_tables xt_set xt_multiport ipt_rpfilter ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel bpf_preload wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel veth nf_conntrack_netlink xt_addrtype xt_statistic xt_nat ipt_REJECT ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_MASQUERADE xt_mark xt_conntrack xt_comment nft_compat overlay nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink sunrpc vfat fat intel_rapl_msr intel_rapl_common xfs kvm_intel kvm joydev irqbypass rapl virtio_balloon i2c_piix4 tcp_bbr sch_fq nvme_tcp nvme_fabrics nvme_core nvme_common loop zram crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic bochs virtio_net drm_vram_helper ghash_clmulni_intel drm_ttm_helper ttm sha512_ssse3
[  506.458211]  net_failover virtio_console virtio_scsi failover serio_raw ata_generic pata_acpi fuse qemu_fw_cfg
[  506.460551] CR2: 0000000000000038
[  506.460979] ---[ end trace 0000000000000000 ]---
[  506.461392] RIP: 0010:kernel_getsockname+0xb/0x20
[  506.461812] Code: 1f 44 00 00 48 8b 47 20 48 8b 40 20 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 48 8b 47 20 31 d2 <48> 8b 40 38 ff e0 cc 66 90 cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[  506.462642] RSP: 0018:ffffade047e37d00 EFLAGS: 00010246
[  506.463047] RAX: 0000000000000000 RBX: 000000000000001f RCX: 0000000000000001
[  506.463450] RDX: 0000000000000000 RSI: ffffade047e37d08 RDI: ffff8fb64c7afb80
[  506.463874] RBP: 0000000000001000 R08: 0000000000000004 R09: ffff8fb64390501e
[  506.464258] R10: ffffffffffffffff R11: 0000000000000000 R12: ffffade047e37d08
[  506.464661] R13: ffff8fb643905000 R14: ffff8fb686cbb000 R15: 0000000000000001
[  506.465043] FS:  00007f8080e99540(0000) GS:ffff8fc53f980000(0000) knlGS:0000000000000000
[  506.465458] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  506.465869] CR2: 0000000000000038 CR3: 0000000190e48001 CR4: 0000000000170ee0

OS info: Fedora 37, kernel 6.1.14-200.fc37.x86_64, running as QEMU/KVM guests (per the dumps above).

Additional context Not sure if this is relevant: there are 3 nodes running on Proxmox, 1 master and 2 worker nodes. Each worker node is pinned to one physical CPU, and only one node is handling storage.

Let me know if you need any other info or have any ideas on how to debug this further.

Szpadel commented 1 year ago

This time I caught another crash message on the node with the io-engine (this might not be the io-engine crashing, but the whole node, which then brings down all the others via the bug in nvme_tcp):

[  188.144797] usercopy: Kernel memory exposure attempt detected from page alloc (offset 0, size 24576)!
[  188.145098] ------------[ cut here ]------------
[  188.145102] kernel BUG at mm/usercopy.c:101!
[  188.145205] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  188.145271] CPU: 2 PID: 10570 Comm: lcore-worker-2 Not tainted 6.1.14-200.fc37.x86_64 #1
[  188.145346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[  188.145424] RIP: 0010:usercopy_abort+0x75/0x77
[  188.145511] Code: 6f a8 51 48 0f 45 d6 48 89 c1 49 c7 c3 b8 80 71 a8 41 52 48 c7 c6 84 87 6f a8 48 c7 c7 58 80 71 a8 49 0f 45 f3 e8 2f 51 ff ff <0f> 0b 48 89 f1 49 89 e8 44 89 e2 31 f6 48 c7 c7 02 81 71 a8 e8 72
[  188.145662] RSP: 0018:ffffb23d4ba8fa18 EFLAGS: 00010286
[  188.145718] RAX: 0000000000000059 RBX: ffff8e67058b8000 RCX: 0000000000000000
[  188.145778] RDX: 0000000000000001 RSI: ffffffffa8749b33 RDI: 00000000ffffffff
[  188.145844] RBP: 0000000000006000 R08: 0000000000000000 R09: ffffb23d4ba8f8c8
[  188.145902] R10: 0000000000000003 R11: ffffffffa9147448 R12: 0000000000000001
[  188.145946] R13: ffff8e67058be000 R14: 0000000000000018 R15: ffff8e66448c4ed0
[  188.145991] FS:  00007f9ab0f30640(0000) GS:ffff8e6e8f880000(0000) knlGS:0000000000000000
[  188.146062] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  188.146122] CR2: 00005635efc7a008 CR3: 00000003647f0004 CR4: 0000000000170ee0
[  188.146196] Call Trace:
[  188.146247]  <TASK>
[  188.146277]  __check_object_size.cold+0x17/0xcb
[  188.146333]  simple_copy_to_iter+0x25/0x40
[  188.146373]  __skb_datagram_iter+0x19e/0x2f0
[  188.146402]  ? skb_free_datagram+0x10/0x10
[  188.146449]  skb_copy_datagram_iter+0x30/0x90
[  188.146506]  ? avc_has_perm+0xa7/0x190
[  188.146563]  tcp_recvmsg_locked+0x254/0x8f0
[  188.146610]  tcp_recvmsg+0x75/0x1d0
[  188.146633]  inet_recvmsg+0x42/0x100
[  188.146660]  ? sock_recvmsg+0x1c/0x70
[  188.146682]  sock_read_iter+0x84/0xd0
[  188.146699]  do_iter_readv_writev+0x112/0x130
[  188.146719]  do_iter_read+0xe8/0x1e0
[  188.146735]  vfs_readv+0x95/0xc0
[  188.146750]  do_readv+0xd2/0x130
[  188.146765]  do_syscall_64+0x58/0x80
[  188.146784]  ? do_readv+0xef/0x130
[  188.146797]  ? syscall_exit_to_user_mode+0x17/0x40
[  188.146814]  ? do_syscall_64+0x67/0x80
[  188.146831]  ? do_syscall_64+0x67/0x80
[  188.146844]  ? do_syscall_64+0x67/0x80
[  188.146861]  ? do_syscall_64+0x67/0x80
[  188.146877]  ? do_syscall_64+0x67/0x80
[  188.146891]  ? do_syscall_64+0x67/0x80
[  188.146906]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  188.146926] RIP: 0033:0x7f9ab1d38b77
[  188.146964] Code: 1f 40 00 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 bb fc f8 ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 13 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 14 fd f8 ff 48
[  188.148093] RSP: 002b:00007f9ab0f2dd50 EFLAGS: 00000293 ORIG_RAX: 0000000000000013
[  188.148638] RAX: ffffffffffffffda RBX: 0000000000000224 RCX: 00007f9ab1d38b77
[  188.149151] RDX: 0000000000000002 RSI: 00007f9ab0f2dd80 RDI: 0000000000000224
[  188.149685] RBP: 00007f9ab0f2dd80 R08: 0000000000000000 R09: 000020003d4183c0
[  188.150120] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
[  188.150418] R13: 00007f9ab0f2dde0 R14: 00007f9ab0f2dd80 R15: 0000000000008240
[  188.150943]  </TASK>
[  188.151409] Modules linked in: xt_multiport xt_set ipt_rpfilter ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel bpf_preload rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel veth nf_conntrack_netlink xt_addrtype xt_statistic xt_nat ipt_REJECT ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_MASQUERADE xt_mark xt_conntrack xt_comment nft_compat overlay nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink sunrpc vfat fat intel_rapl_msr intel_rapl_common kvm_intel xfs kvm irqbypass rapl i2c_piix4 joydev tcp_bbr sch_fq nvme_tcp nvme_fabrics nvme_core nvme_common loop zram crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic bochs virtio_net drm_vram_helper drm_ttm_helper ghash_clmulni_intel net_failover ttm sha512_ssse3
[  188.151590]  virtio_console failover virtio_scsi serio_raw ata_generic pata_acpi fuse qemu_fw_cfg
[  188.154797] ---[ end trace 0000000000000000 ]---
[  188.155218] RIP: 0010:usercopy_abort+0x75/0x77
[  188.155566] Code: 6f a8 51 48 0f 45 d6 48 89 c1 49 c7 c3 b8 80 71 a8 41 52 48 c7 c6 84 87 6f a8 48 c7 c7 58 80 71 a8 49 0f 45 f3 e8 2f 51 ff ff <0f> 0b 48 89 f1 49 89 e8 44 89 e2 31 f6 48 c7 c7 02 81 71 a8 e8 72
[  188.156309] RSP: 0018:ffffb23d4ba8fa18 EFLAGS: 00010286
[  188.156693] RAX: 0000000000000059 RBX: ffff8e67058b8000 RCX: 0000000000000000
[  188.157058] RDX: 0000000000000001 RSI: ffffffffa8749b33 RDI: 00000000ffffffff
[  188.157508] RBP: 0000000000006000 R08: 0000000000000000 R09: ffffb23d4ba8f8c8
[  188.158090] R10: 0000000000000003 R11: ffffffffa9147448 R12: 0000000000000001
[  188.158881] R13: ffff8e67058be000 R14: 0000000000000018 R15: ffff8e66448c4ed0
[  188.159290] FS:  00007f9ab0f30640(0000) GS:ffff8e6e8f880000(0000) knlGS:0000000000000000
[  188.159648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  188.160020] CR2: 00005635efc7a008 CR3: 00000003647f0004 CR4: 0000000000170ee0

This crash is in lcore-worker-2, and if I'm understanding correctly that is part of the io-engine: the trace shows hardened usercopy aborting a 24 KiB copy to userspace in the TCP receive path of a readv() issued by the io-engine.

Szpadel commented 1 year ago

Same with kernel 6.1.15

Szpadel commented 1 year ago

~~usercopy issue seems to be not triggered on 6.0.7~~ It just triggered.

tiagolobocastro commented 1 year ago

Hi @Szpadel, I don't think we've ever seen this, but I also don't think we've tested on kernel v6, maybe @blaisedias can confirm. @Szpadel, would you be able to test with a v5 kernel?

blaisedias commented 1 year ago

@tiagolobocastro yes, we haven't tested with v6 kernels.

Szpadel commented 1 year ago

@tiagolobocastro it's now running on 5.15.97. As I have no idea how to trigger the usercopy bug, I'll report back if it triggers.

Szpadel commented 1 year ago

Looks like the usercopy bug does not happen on 5.15.97.

laibe commented 1 year ago

Thanks @Szpadel for reporting this, I've observed the same kernel panic on talos 1.3.5 with kernel 5.15.94 just now.

tiagolobocastro commented 1 year ago

@laibe could you please share how you were able to hit this on v5? @blaisedias what specific kernel versions do we test with?

laibe commented 1 year ago

Unfortunately I did not have kernel log delivery set up, so I only see the last couple of lines via IPMI (see below), since talos AFAIK does not keep dmesg logs from previous boots. Essentially the whole node crashed and I had to do a hard reset.

[screenshot: kernel_panic_ms]
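
For next time, talos can stream the kernel log off-box with `talosctl`, so the tail should survive a panic; a sketch (the node IP and output file are placeholders):

```sh
# Follow the node's kernel log over the talos API and keep a local copy,
# so the lines leading up to a panic are preserved off the node.
talosctl dmesg --follow --nodes 10.0.0.1 | tee talos-dmesg.log
```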

Szpadel commented 1 year ago

@laibe in my case the logs were always cut off before the panic happened, and I was catching them by keeping an ssh session open with dmesg -w.
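
i.e. roughly this (hostname and file name are placeholders):

```sh
# Stream kernel messages from the node and tee them to a local file so
# the output survives the node crashing; -w (--follow) waits for new lines.
ssh worker-1 'dmesg -w' | tee worker-1-dmesg.log
```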

laibe commented 1 year ago

In my case it happened during the night and I was not expecting it to happen, otherwise I would have logged dmesg. I'd like to do some further testing with the latest talos release (kernel 5.15.102).

tiagolobocastro commented 1 year ago

Given that you've now also reproduced this on v5: I should note that we don't really test on talos, mostly just ubuntu AFAIK. Maybe the bug is simply more easily reproducible on talos? Anyway, someone from our team is getting up to speed with talos, so hopefully we'll be able to do some testing of our own and try to reproduce these :crossed_fingers:

laibe commented 1 year ago

Bare metal clusters with talos are relatively easy to set up, so it might even turn out to be a convenient way to run your test setups :slightly_smiling_face:. I don't have a bare metal cluster to play around with at the moment, otherwise I could assist with further debugging.

tiagolobocastro commented 1 year ago

@datacore-tilangovan maybe @laibe could help with any issues you may find, if he's willing and able of course, no pressure :)

laibe commented 1 year ago

I've pinged @datacore-bolt-ci in the Mayastor discord. As mentioned, I unfortunately don't have a bare metal cluster available for debugging at the moment, but I'm happy to help with setting up talos and mayastor on talos.

tiagolobocastro commented 5 months ago

@blaisedias can we run tests on kernel v6?

vincedihub commented 3 months ago

On Talos 1.6.7 - Kernel 6.1.82 - https://github.com/siderolabs/talos/releases/tag/v1.6.7

Every time I declare a DiskPool I get a kernel panic.
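
For reference, "declaring a DiskPool" means applying a CR of roughly this shape; this is an illustrative sketch (the API version, names, namespace, and disk path are placeholders, not the exact manifest used here):

```sh
# Illustrative DiskPool declaration; the pool name, namespace, node name,
# API version, and disk path below are placeholders for this report.
kubectl apply -f - <<'EOF'
apiVersion: openebs.io/v1beta1
kind: DiskPool
metadata:
  name: pool-on-node-1
  namespace: mayastor
spec:
  node: worker-node-1
  disks:
    - /dev/disk/by-id/nvme-example
EOF
```

The panic captured from the node: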

10.0.0.125: kern:     err: [2024-03-24T16:39:46.213509249Z]: rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
10.0.0.125: kern:     err: [2024-03-24T16:39:46.214733249Z]: rcu:       1-...0: (1 GPs behind) idle=bb1c/1/0x4000000000000000 softirq=11407/11411 fqs=10527
10.0.0.125: kern: warning: [2024-03-24T16:39:46.216561249Z]:    (detected by 2, t=52518 jiffies, g=16029, q=263999 ncpus=16)
10.0.0.125: kern:    info: [2024-03-24T16:39:46.217919249Z]: Sending NMI from CPU 2 to CPUs 1:
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218811249Z]: NMI backtrace for cpu 1
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218813249Z]: CPU: 1 PID: 5624 Comm: io-engine Not tainted 6.1.82-talos #1
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218819249Z]: Hardware name: Intel(R) Client Systems NUC12WSHi5/NUC12WSBi5, BIOS WSADL357.0085.2022.0718.1739 07/18/2022
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218820249Z]: RIP: 0010:swiotlb_tbl_map_single+0x25c/0x660
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218828249Z]: Code: ed 75 75 4d 85 f6 75 02 0f 0b 4d 8d 56 ff 4c 89 f1 4c 89 64 24 20 4c 21 d1 48 89 4c 24 30 4c 8b 5c 24 30 4d 85 db 75 e0 89 e9 <4c> 8b 64 24 08 49 8d 14 09 4c 21 d2 4c 01 e2 49 39 d6 72 1e 48 8d
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218829249Z]: RSP: 0018:ffff9d95c0a03878 EFLAGS: 00000046
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218832249Z]: RAX: 00000000000006e1 RBX: 0000000000000004 RCX: 0000000000000ee1
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218833249Z]: RDX: 000000022a51d600 RSI: 000000000000004c RDI: 0000000000000800
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218833249Z]: RBP: 0000000000000ee1 R08: 0000000000000800 R09: 000000000007ed3a
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218834249Z]: R10: 00000000001fffff R11: 0000000000000000 R12: 000000003f69d000
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218834249Z]: R13: 0000000215b10e00 R14: 0000000000200000 R15: ffffffffb56a1f00
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218835249Z]: FS:  00007fd88ec13dc0(0000) GS:ffff9ac9d7640000(0000) knlGS:0000000000000000
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218836249Z]: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218837249Z]: CR2: 000055a195861070 CR3: 00000002588fe000 CR4: 0000000000750ee0
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218837249Z]: PKRU: 55555554
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218838249Z]: Call Trace:
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218840249Z]:  <NMI>
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218843249Z]:  ? nmi_cpu_backtrace.cold+0x32/0x68
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218847249Z]:  ? nmi_cpu_backtrace_handler+0xd/0x20
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218850249Z]:  ? nmi_handle+0x53/0xf0
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218853249Z]:  ? default_do_nmi+0x40/0x120
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218854249Z]:  ? exc_nmi+0xfe/0x130
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218855249Z]:  ? end_repeat_nmi+0x16/0x67
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218860249Z]:  ? swiotlb_tbl_map_single+0x25c/0x660
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218862249Z]:  ? swiotlb_tbl_map_single+0x25c/0x660
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218863249Z]:  ? swiotlb_tbl_map_single+0x25c/0x660
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218865249Z]:  </NMI>
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218865249Z]:  <TASK>
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218866249Z]:  iommu_dma_map_page+0x18c/0x230
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218871249Z]:  iommu_dma_map_sg+0x219/0x410
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218873249Z]:  __dma_map_sg_attrs+0x26/0x90
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218875249Z]:  dma_map_sgtable+0x19/0x30
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218876249Z]:  nvme_map_data+0xd7/0x860
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218881249Z]:  nvme_queue_rqs+0xce/0x280
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218884249Z]:  blk_mq_flush_plug_list.part.0+0x1f8/0x2a0
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218887249Z]:  __blk_flush_plug+0xf1/0x150
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218889249Z]:  blk_finish_plug+0x25/0x40
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218890249Z]:  blkdev_write_iter+0x12a/0x1a0
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218893249Z]:  aio_write+0x15b/0x270
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218897249Z]:  ? io_submit_one+0x467/0x7b0
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218899249Z]:  io_submit_one+0x467/0x7b0
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218901249Z]:  __x64_sys_io_submit+0xa9/0x170
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218903249Z]:  do_syscall_64+0x59/0x90
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218905249Z]:  entry_SYSCALL_64_after_hwframe+0x64/0xce
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218908249Z]: RIP: 0033:0x7fd88ed61d3d
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218910249Z]: Code: 0c eb d3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 20 0f 00 f7 d8 64 89 01 48
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218911249Z]: RSP: 002b:00007ffe548f77d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218912249Z]: RAX: ffffffffffffffda RBX: 00007fd88ec126b8 RCX: 00007fd88ed61d3d
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218913249Z]: RDX: 00007ffe548f7828 RSI: 0000000000000001 RDI: 00007fd88c16c000
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218913249Z]: RBP: 00007fd88c16c000 R08: 0000000000000008 R09: 000055a195fdff50
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218914249Z]: R10: 000020001a51e7c0 R11: 0000000000000246 R12: 0000000000000001
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218914249Z]: R13: 0000000000000000 R14: 00007ffe548f7828 R15: 0000000000000000
10.0.0.125: kern: warning: [2024-03-24T16:39:46.218915249Z]:  </TASK>

tiagolobocastro commented 3 months ago

This seems different; it seems to have happened whilst creating the diskpool!? I'm on 6.1.81 and I don't have this issue. I'll check if I can upgrade to 6.1.82 and see what happens.. Although I'm not on TALOS..