Closed matttbe closed 8 months ago
Closing this as this is not related to MPTCP.
I will report it to netdev ML if I manage to reproduce it.
I probably hit the same issue just after sending a ping (IPv6, I suppose), this time with a decoded stack trace, still on top of net:
[ 45.505495] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 45.505547] CPU: 1 PID: 1070 Comm: ping Tainted: G N 6.7.0-g244ee3389ffe #1
[ 45.505547] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 45.505547] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: 17 (bad)
3e: 9d popf
3f: 11 .byte 0x11
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: 17 (bad)
14: 9d popf
15: 11 .byte 0x11
[ 45.505547] RSP: 0018:ffffb106c00f0af8 EFLAGS: 00000246
[ 45.505547] RAX: 0000000000000000 RBX: ffff99918827b000 RCX: 0000000000000000
[ 45.505547] RDX: 000000000000000a RSI: ffff99918827d000 RDI: ffff9991819e6400
[ 45.505547] RBP: ffff9991819e6400 R08: 0000000000000000 R09: 0000000000000068
[ 45.505547] R10: ffff999181c104c0 R11: 736f6d6570736575 R12: ffff9991819e6400
[ 45.505547] R13: 0000000000000076 R14: 0000000000000000 R15: ffff99918827c000
[ 45.505547] FS: 00007fa1d06ca1c0(0000) GS:ffff9991fdc80000(0000) knlGS:0000000000000000
[ 45.505547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 45.505547] CR2: 0000559b91aac240 CR3: 0000000004986000 CR4: 00000000000006f0
[ 45.505547] Call Trace:
[ 45.505547] <IRQ>
[ 45.505547] ? die (arch/x86/kernel/dumpstack.c:421)
[ 45.505547] ? exc_int3 (arch/x86/kernel/traps.c:762)
[ 45.505547] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
[ 45.505547] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 45.505547] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 45.505547] __netif_rx (net/core/dev.c:5084)
[ 45.505547] veth_xmit (drivers/net/veth.c:321)
[ 45.505547] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 45.505547] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 45.505547] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783)
[ 45.505547] ? eth_header (net/ethernet/eth.c:85)
[ 45.505547] ip6_finish_output2 (include/net/neighbour.h:542)
[ 45.505547] ? ip6_output (include/linux/netfilter.h:301)
[ 45.505547] ? ip6_mtu (net/ipv6/route.c:3208)
[ 45.505547] ip6_send_skb (net/ipv6/ip6_output.c:1953)
[ 45.505547] icmpv6_echo_reply (net/ipv6/icmp.c:812)
[ 45.505547] ? icmpv6_rcv (net/ipv6/icmp.c:939)
[ 45.505547] icmpv6_rcv (net/ipv6/icmp.c:939)
[ 45.505547] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:440)
[ 45.505547] ip6_input_finish (include/linux/rcupdate.h:779)
[ 45.505547] __netif_receive_skb_one_core (net/core/dev.c:5537)
[ 45.505547] process_backlog (include/linux/rcupdate.h:779)
[ 45.505547] __napi_poll (net/core/dev.c:6576)
[ 45.505547] net_rx_action (net/core/dev.c:6647)
[ 45.505547] __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 45.505547] do_softirq (kernel/softirq.c:454)
[ 45.505547] </IRQ>
[ 45.505547] <TASK>
[ 45.505547] __local_bh_enable_ip (kernel/softirq.c:381)
[ 45.505547] __dev_queue_xmit (net/core/dev.c:4379)
[ 45.505547] ip6_finish_output2 (include/linux/netdevice.h:3171)
[ 45.505547] ? ip6_output (include/linux/netfilter.h:301)
[ 45.505547] ? ip6_mtu (net/ipv6/route.c:3208)
[ 45.505547] ip6_send_skb (net/ipv6/ip6_output.c:1953)
[ 45.505547] rawv6_sendmsg (net/ipv6/raw.c:584)
[ 45.505547] ? netfs_clear_subrequests (include/linux/list.h:373)
[ 45.505547] ? netfs_alloc_request (fs/netfs/objects.c:42)
[ 45.505547] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206)
[ 45.505547] ? set_pte_range (mm/memory.c:4529)
[ 45.505547] ? next_uptodate_folio (include/linux/xarray.h:1699)
[ 45.505547] ? __sock_sendmsg (net/socket.c:733)
[ 45.505547] __sock_sendmsg (net/socket.c:733)
[ 45.505547] ? move_addr_to_kernel.part.0 (net/socket.c:253)
[ 45.505547] __sys_sendto (net/socket.c:2191)
[ 45.505547] ? __hrtimer_run_queues (include/linux/seqlock.h:566)
[ 45.505547] ? __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 45.505547] __x64_sys_sendto (net/socket.c:2203)
[ 45.505547] do_syscall_64 (arch/x86/entry/common.c:52)
[ 45.505547] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 45.505547] RIP: 0033:0x7fa1d099ca0a
[ 45.505547] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
0: d8 64 89 02 fsubs 0x2(%rcx,%rcx,4)
4: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
b: eb b8 jmp 0xffffffffffffffc5
d: 0f 1f 00 nopl (%rax)
10: f3 0f 1e fa endbr64
14: 41 89 ca mov %ecx,%r10d
17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
1e: 00
1f: 85 c0 test %eax,%eax
21: 75 15 jne 0x38
23: b8 2c 00 00 00 mov $0x2c,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 7e ja 0xb0
32: c3 ret
33: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
38: 41 54 push %r12
3a: 48 83 ec 30 sub $0x30,%rsp
3e: 44 rex.R
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 7e ja 0x86
8: c3 ret
9: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
e: 41 54 push %r12
10: 48 83 ec 30 sub $0x30,%rsp
14: 44 rex.R
15: 89 .byte 0x89
[ 45.505547] RSP: 002b:00007ffe47710958 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 45.505547] RAX: ffffffffffffffda RBX: 00007ffe47712090 RCX: 00007fa1d099ca0a
[ 45.505547] RDX: 0000000000000040 RSI: 0000559b91bbd300 RDI: 0000000000000003
[ 45.505547] RBP: 0000559b91bbd300 R08: 00007ffe477142a4 R09: 000000000000001c
[ 45.505547] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe47711c20
[ 45.505547] R13: 0000000000000040 R14: 0000559b91bbf4f4 R15: 00007ffe47712090
[ 45.505547] </TASK>
[ 45.505547] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[ 45.505547] ---[ end trace 0000000000000000 ]---
[ 45.505547] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: 17 (bad)
3e: 9d popf
3f: 11 .byte 0x11
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: 17 (bad)
14: 9d popf
15: 11 .byte 0x11
[ 45.505547] RSP: 0018:ffffb106c00f0af8 EFLAGS: 00000246
[ 45.505547] RAX: 0000000000000000 RBX: ffff99918827b000 RCX: 0000000000000000
[ 45.505547] RDX: 000000000000000a RSI: ffff99918827d000 RDI: ffff9991819e6400
[ 45.505547] RBP: ffff9991819e6400 R08: 0000000000000000 R09: 0000000000000068
[ 45.505547] R10: ffff999181c104c0 R11: 736f6d6570736575 R12: ffff9991819e6400
[ 45.505547] R13: 0000000000000076 R14: 0000000000000000 R15: ffff99918827c000
[ 45.505547] FS: 00007fa1d06ca1c0(0000) GS:ffff9991fdc80000(0000) knlGS:0000000000000000
[ 45.505547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 45.505547] CR2: 0000559b91aac240 CR3: 0000000004986000 CR4: 00000000000006f0
[ 45.505547] Kernel panic - not syncing: Fatal exception in interrupt
[ 45.505547] Kernel Offset: 0x37600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7537561906/job/20516659466 export-net/20240116T054733 244ee3389ffe62920946feece271446d59c9dc92
EDIT: we have CONFIG_RPS=y in the tests. The Kconfig and the vmlinux can be found in the artifacts linked to the test job.
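As a reading aid for the `Code:` line in the trace above, here is a minimal sketch (not part of the original report) that finds the byte marked with `<..>`, which is the byte at the reported RIP, and recomputes the jump target. It assumes that the instruction starting one byte earlier is a 5-byte `jmp rel32` (opcode `e9`). The function name `decode_marked_jmp` is made up for this illustration.

```python
import re


def decode_marked_jmp(code_line):
    """Return (instruction_offset, jump_target) for a jmp rel32 whose
    second byte carries the <..> marker in a kernel 'Code:' line."""
    tokens = code_line.split()
    # Index of the byte the kernel marked as being at RIP.
    marked = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    raw = bytes(int(re.sub(r"[<>]", "", t), 16) for t in tokens)
    start = marked - 1  # assumption: the instruction starts one byte earlier
    if raw[start] != 0xE9:
        raise ValueError("byte before the marker is not a jmp rel32 opcode")
    # rel32 displacement is little-endian and relative to the next instruction.
    disp = int.from_bytes(raw[start + 1:start + 5], "little", signed=True)
    return start, start + 5 + disp


code = ("0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd "
        "48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 "
        "<c9> 00 00 00 66 90 66 90")
offset, target = decode_marked_jmp(code)
print(hex(offset), hex(target))  # 0x29 0xf7
```

This agrees with the `29:* e9 c9 00 00 00 jmp 0xf7` line of the decoded output. On x86, the RIP saved for an int3 trap points just past the one-byte breakpoint, so a marker on the second byte of the jump would be consistent with a breakpoint byte having been at offset 0x29 when the trap fired, and the final `jmp` opcode being in place by the time the dump was printed.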
Yet another one, on top of net-next + net:
# INFO: validating network environment with pings
[ 46.316504] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 46.316504] CPU: 0 PID: 1078 Comm: ping Tainted: G N 6.7.0-g2572fed72ac3 #1
[ 46.316504] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 46.316504] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.316504] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c 31
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: d7 xlat %ds:(%rbx)
3e: 9c pushf
3f: 31 .byte 0x31
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: d7 xlat %ds:(%rbx)
14: 9c pushf
15: 31 .byte 0x31
[ 46.316504] RSP: 0018:ffffb96ac0003af8 EFLAGS: 00000246
[ 46.316504] RAX: 0000000000000000 RBX: ffff9d8088424000 RCX: 0000000000000000
[ 46.316504] RDX: 000000000000000a RSI: ffff9d8088426000 RDI: ffff9d8081b1f400
[ 46.316504] RBP: ffff9d8081b1f400 R08: 0000000000000000 R09: 0000000000000000
[ 46.316504] R10: ffff9d8082338000 R11: 736f6d6570736575 R12: ffff9d8081b1f400
[ 46.316504] R13: 0000000000000076 R14: 0000000000000000 R15: ffff9d8088422000
[ 46.316504] FS: 00007f6e554d61c0(0000) GS:ffff9d80fdc00000(0000) knlGS:0000000000000000
[ 46.316504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.316504] CR2: 00005619ee27d240 CR3: 000000000548a000 CR4: 00000000000006f0
[ 46.316504] Call Trace:
[ 46.316504] <IRQ>
[ 46.316504] ? die (arch/x86/kernel/dumpstack.c:421)
[ 46.316504] ? exc_int3 (arch/x86/kernel/traps.c:762)
[ 46.316504] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
[ 46.316504] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.316504] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.316504] __netif_rx (net/core/dev.c:5084)
[ 46.316504] veth_xmit (drivers/net/veth.c:321)
[ 46.316504] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 46.316504] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 46.316504] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783)
[ 46.316504] ip6_finish_output2 (include/linux/netdevice.h:3171)
[ 46.316504] ? ip6_output (include/linux/netfilter.h:301)
[ 46.316504] ? ip6_mtu (net/ipv6/route.c:3208)
[ 46.316504] ip6_send_skb (net/ipv6/ip6_output.c:1953)
[ 46.316504] icmpv6_echo_reply (net/ipv6/icmp.c:812)
[ 46.316504] ? icmpv6_rcv (net/ipv6/icmp.c:939)
[ 46.316504] icmpv6_rcv (net/ipv6/icmp.c:939)
[ 46.316504] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:440)
[ 46.316504] ip6_input_finish (include/linux/rcupdate.h:779)
[ 46.316504] __netif_receive_skb_one_core (net/core/dev.c:5537)
[ 46.316504] process_backlog (include/linux/rcupdate.h:779)
[ 46.316504] __napi_poll (net/core/dev.c:6576)
[ 46.316504] net_rx_action (net/core/dev.c:6647)
[ 46.316504] __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 46.316504] do_softirq (kernel/softirq.c:454)
[ 46.316504] </IRQ>
[ 46.316504] <TASK>
[ 46.316504] __local_bh_enable_ip (kernel/softirq.c:381)
[ 46.316504] __dev_queue_xmit (net/core/dev.c:4379)
[ 46.316504] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783)
[ 46.316504] ip6_finish_output2 (include/linux/netdevice.h:3171)
[ 46.316504] ? ip6_output (include/linux/netfilter.h:301)
[ 46.316504] ? ip6_mtu (net/ipv6/route.c:3208)
[ 46.316504] ip6_send_skb (net/ipv6/ip6_output.c:1953)
[ 46.316504] rawv6_sendmsg (net/ipv6/raw.c:584)
[ 46.316504] ? netfs_clear_subrequests (include/linux/list.h:373)
[ 46.316504] ? netfs_alloc_request (fs/netfs/objects.c:42)
[ 46.316504] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206)
[ 46.316504] ? set_pte_range (mm/memory.c:4529)
[ 46.316504] ? next_uptodate_folio (include/linux/xarray.h:1699)
[ 46.316504] ? __sock_sendmsg (net/socket.c:733)
[ 46.316504] __sock_sendmsg (net/socket.c:733)
[ 46.316504] ? move_addr_to_kernel.part.0 (net/socket.c:253)
[ 46.316504] __sys_sendto (net/socket.c:2191)
[ 46.316504] ? ktime_get_real_ts64 (kernel/time/timekeeping.c:292 (discriminator 3))
[ 46.316504] __x64_sys_sendto (net/socket.c:2203)
[ 46.316504] do_syscall_64 (arch/x86/entry/common.c:52)
[ 46.316504] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 46.316504] RIP: 0033:0x7f6e557a8a0a
[ 46.316504] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
0: d8 64 89 02 fsubs 0x2(%rcx,%rcx,4)
4: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
b: eb b8 jmp 0xffffffffffffffc5
d: 0f 1f 00 nopl (%rax)
10: f3 0f 1e fa endbr64
14: 41 89 ca mov %ecx,%r10d
17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
1e: 00
1f: 85 c0 test %eax,%eax
21: 75 15 jne 0x38
23: b8 2c 00 00 00 mov $0x2c,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 7e ja 0xb0
32: c3 ret
33: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
38: 41 54 push %r12
3a: 48 83 ec 30 sub $0x30,%rsp
3e: 44 rex.R
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 7e ja 0x86
8: c3 ret
9: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
e: 41 54 push %r12
10: 48 83 ec 30 sub $0x30,%rsp
14: 44 rex.R
15: 89 .byte 0x89
[ 46.316504] RSP: 002b:00007ffd7aa1f0b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 46.316504] RAX: ffffffffffffffda RBX: 00007ffd7aa207f0 RCX: 00007f6e557a8a0a
[ 46.316504] RDX: 0000000000000040 RSI: 00005619effd8300 RDI: 0000000000000003
[ 46.316504] RBP: 00005619effd8300 R08: 00007ffd7aa22a04 R09: 000000000000001c
[ 46.316504] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd7aa20380
[ 46.316504] R13: 0000000000000040 R14: 00005619effda4f4 R15: 00007ffd7aa207f0
[ 46.316504] </TASK>
[ 46.316504] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[ 46.316504] ---[ end trace 0000000000000000 ]---
[ 46.316504] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.316504] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c 31
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: d7 xlat %ds:(%rbx)
3e: 9c pushf
3f: 31 .byte 0x31
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: d7 xlat %ds:(%rbx)
14: 9c pushf
15: 31 .byte 0x31
[ 46.316504] RSP: 0018:ffffb96ac0003af8 EFLAGS: 00000246
[ 46.316504] RAX: 0000000000000000 RBX: ffff9d8088424000 RCX: 0000000000000000
[ 46.316504] RDX: 000000000000000a RSI: ffff9d8088426000 RDI: ffff9d8081b1f400
[ 46.316504] RBP: ffff9d8081b1f400 R08: 0000000000000000 R09: 0000000000000000
[ 46.316504] R10: ffff9d8082338000 R11: 736f6d6570736575 R12: ffff9d8081b1f400
[ 46.316504] R13: 0000000000000076 R14: 0000000000000000 R15: ffff9d8088422000
[ 46.316504] FS: 00007f6e554d61c0(0000) GS:ffff9d80fdc00000(0000) knlGS:0000000000000000
[ 46.316504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.316504] CR2: 00005619ee27d240 CR3: 000000000548a000 CR4: 00000000000006f0
[ 46.316504] Kernel panic - not syncing: Fatal exception in interrupt
[ 46.316504] Kernel Offset: 0x32400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7545349968/job/20540751697 export/20240116T172013 2572fed72ac38ba0b431c42d3ed3d95f9ccea066
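For anyone working from the vmlinux in the CI artifacts, the `Kernel Offset` line at the end of each panic gives the KASLR slide. The sketch below is an illustration only and assumes the usual reading of that line: the runtime text base is the link-time base plus the offset. The helper names `kaslr_base` and `to_link_address` are made up here.

```python
import re

LINE = ("Kernel Offset: 0x32400000 from 0xffffffff81000000 "
        "(relocation range: 0xffffffff80000000-0xffffffffbfffffff)")


def kaslr_base(line):
    """Return (runtime_base, in_range) parsed from a 'Kernel Offset' line."""
    m = re.search(r"Kernel Offset: (0x[0-9a-f]+) from (0x[0-9a-f]+) "
                  r"\(relocation range: (0x[0-9a-f]+)-(0x[0-9a-f]+)\)", line)
    offset, link_base, lo, hi = (int(g, 16) for g in m.groups())
    runtime = link_base + offset
    return runtime, lo <= runtime <= hi


def to_link_address(runtime_addr, line):
    """Translate a runtime text address back to its link-time address."""
    m = re.search(r"Kernel Offset: (0x[0-9a-f]+)", line)
    return runtime_addr - int(m.group(1), 16)


base, ok = kaslr_base(LINE)
print(hex(base), ok)  # 0xffffffffb3400000 True
```

The traces in this thread are already symbolized, so this only matters when raw addresses have to be matched against the vmlinux by hand.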
Note that Eric suggests this is probably an issue on x86's side.
We had another stack trace:
# INFO: validating network environment with pings
[ 46.565607] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 46.565607] CPU: 2 PID: 1079 Comm: ping Tainted: G N 6.7.0-g1fd81af266b7 #1
[ 46.565607] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 46.565607] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c d1
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: d7 xlat %ds:(%rbx)
3e: 9c pushf
3f: d1 .byte 0xd1
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: d7 xlat %ds:(%rbx)
14: 9c pushf
15: d1 .byte 0xd1
[ 46.565607] RSP: 0018:ffff9ffc0011cc08 EFLAGS: 00000246
[ 46.565607] RAX: 0000000000000000 RBX: ffff9b8983696000 RCX: 0000000000000001
[ 46.565607] RDX: 0000000000000002 RSI: ffff9b8983697000 RDI: ffff9b89821cd600
[ 46.565607] RBP: ffff9b89821cd600 R08: 0000000000000000 R09: 000000000000001c
[ 46.565607] R10: ffff9b8983a05910 R11: ffff9b8983a05900 R12: ffff9b89821cd600
[ 46.565607] R13: 000000000000002a R14: 0000000000000000 R15: ffff9b8983695000
[ 46.565607] FS: 00007f5bd2bf61c0(0000) GS:ffff9b89fdd00000(0000) knlGS:0000000000000000
[ 46.565607] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.565607] CR2: 00007ffd2b36eff8 CR3: 00000000019b0000 CR4: 00000000000006f0
[ 46.565607] Call Trace:
[ 46.565607] <IRQ>
[ 46.565607] ? die (arch/x86/kernel/dumpstack.c:421)
[ 46.565607] ? exc_int3 (arch/x86/kernel/traps.c:762)
[ 46.565607] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
[ 46.565607] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] __netif_rx (net/core/dev.c:5084)
[ 46.565607] veth_xmit (drivers/net/veth.c:321)
[ 46.565607] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 46.565607] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 46.565607] ? arp_send_dst (net/ipv4/arp.c:314)
[ 46.565607] arp_solicit (net/ipv4/arp.c:392)
[ 46.565607] ? kmem_cache_alloc (mm/slub.c:3843)
[ 46.565607] ? arp_constructor (net/ipv4/arp.c:249)
[ 46.565607] neigh_probe (arch/x86/include/asm/atomic.h:53)
[ 46.565607] __neigh_event_send (net/core/neighbour.c:1242)
[ 46.565607] neigh_resolve_output (net/core/neighbour.c:1547)
[ 46.565607] ip_finish_output2 (include/net/neighbour.h:542)
[ 46.565607] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884)
[ 46.565607] __netif_receive_skb_one_core (net/core/dev.c:5537)
[ 46.565607] process_backlog (include/linux/rcupdate.h:779)
[ 46.565607] __napi_poll (net/core/dev.c:6576)
[ 46.565607] net_rx_action (net/core/dev.c:6647)
[ 46.565607] __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] do_softirq (kernel/softirq.c:454)
[ 46.565607] </IRQ>
[ 46.565607] <TASK>
[ 46.565607] __local_bh_enable_ip (kernel/softirq.c:381)
[ 46.565607] __dev_queue_xmit (net/core/dev.c:4379)
[ 46.565607] ip_finish_output2 (include/linux/netdevice.h:3171)
[ 46.565607] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884)
[ 46.565607] ip_push_pending_frames (net/ipv4/ip_output.c:1490)
[ 46.565607] raw_sendmsg (net/ipv4/raw.c:647)
[ 46.565607] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206)
[ 46.565607] ? set_pte_range (mm/memory.c:4529)
[ 46.565607] ? update_load_avg (kernel/sched/fair.c:4405)
[ 46.565607] ? __sock_sendmsg (net/socket.c:733)
[ 46.565607] __sock_sendmsg (net/socket.c:733)
[ 46.565607] ? move_addr_to_kernel.part.0 (net/socket.c:253)
[ 46.565607] __sys_sendto (net/socket.c:2191)
[ 46.565607] ? __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] __x64_sys_sendto (net/socket.c:2203)
[ 46.565607] do_syscall_64 (arch/x86/entry/common.c:52)
[ 46.565607] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 46.565607] RIP: 0033:0x7f5bd2ec8a0a
[ 46.565607] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
0: d8 64 89 02 fsubs 0x2(%rcx,%rcx,4)
4: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
b: eb b8 jmp 0xffffffffffffffc5
d: 0f 1f 00 nopl (%rax)
10: f3 0f 1e fa endbr64
14: 41 89 ca mov %ecx,%r10d
17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
1e: 00
1f: 85 c0 test %eax,%eax
21: 75 15 jne 0x38
23: b8 2c 00 00 00 mov $0x2c,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 7e ja 0xb0
32: c3 ret
33: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
38: 41 54 push %r12
3a: 48 83 ec 30 sub $0x30,%rsp
3e: 44 rex.R
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 7e ja 0x86
8: c3 ret
9: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
e: 41 54 push %r12
10: 48 83 ec 30 sub $0x30,%rsp
14: 44 rex.R
15: 89 .byte 0x89
[ 46.565607] RSP: 002b:00007ffd2b36eff8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 46.565607] RAX: ffffffffffffffda RBX: 00007ffd2b3706a0 RCX: 00007f5bd2ec8a0a
[ 46.565607] RDX: 0000000000000040 RSI: 00005645ab399300 RDI: 0000000000000003
[ 46.565607] RBP: 00005645ab399300 R08: 00007ffd2b372920 R09: 0000000000000010
[ 46.565607] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[ 46.565607] R13: 00007ffd2b370238 R14: 00007ffd2b36f000 R15: 00007ffd2b3706a0
[ 46.565607] </TASK>
[ 46.565607] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[ 46.565607] ---[ end trace 0000000000000000 ]---
[ 46.565607] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c d1
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: d7 xlat %ds:(%rbx)
3e: 9c pushf
3f: d1 .byte 0xd1
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: d7 xlat %ds:(%rbx)
14: 9c pushf
15: d1 .byte 0xd1
[ 46.565607] RSP: 0018:ffff9ffc0011cc08 EFLAGS: 00000246
[ 46.565607] RAX: 0000000000000000 RBX: ffff9b8983696000 RCX: 0000000000000001
[ 46.565607] RDX: 0000000000000002 RSI: ffff9b8983697000 RDI: ffff9b89821cd600
[ 46.565607] RBP: ffff9b89821cd600 R08: 0000000000000000 R09: 000000000000001c
[ 46.565607] R10: ffff9b8983a05910 R11: ffff9b8983a05900 R12: ffff9b89821cd600
[ 46.565607] R13: 000000000000002a R14: 0000000000000000 R15: ffff9b8983695000
[ 46.565607] FS: 00007f5bd2bf61c0(0000) GS:ffff9b89fdd00000(0000) knlGS:0000000000000000
[ 46.565607] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.565607] CR2: 00007ffd2b36eff8 CR3: 00000000019b0000 CR4: 00000000000006f0
[ 46.565607] Kernel panic - not syncing: Fatal exception in interrupt
[ 46.565607] Kernel Offset: 0x39a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7550415246/job/20556002186 On top of export. patchew/cover.1705331716.git.pabeni@redhat.com
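The IPv6 and IPv4 (ARP) traces above can be compared mechanically. The following is a small sketch, not a tool used in this thread: it keeps only the frames without a leading `?` (those are the ones the unwinder considers reliable) and returns the frames two traces share from the top. The names `reliable_frames` and `common_prefix` are hypothetical.

```python
import re

# "[ 46.316504] ? func (file:line)" -> optional "? ", function name, location
FRAME = re.compile(r"^\[\s*\d+\.\d+\]\s+(\? )?([\w.]+) \((.*)\)$")


def reliable_frames(log):
    """Function names of the frames that are not marked with '?'."""
    frames = []
    for line in log.splitlines():
        m = FRAME.match(line.strip())
        if m and not m.group(1):
            frames.append(m.group(2))
    return frames


def common_prefix(a, b):
    """Leading frames shared by two traces."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out


ipv6 = """[ 46.316504] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.316504] __netif_rx (net/core/dev.c:5084)
[ 46.316504] veth_xmit (drivers/net/veth.c:321)
[ 46.316504] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 46.316504] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 46.316504] ip6_finish_output2 (include/linux/netdevice.h:3171)"""

ipv4 = """[ 46.565607] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 46.565607] __netif_rx (net/core/dev.c:5084)
[ 46.565607] veth_xmit (drivers/net/veth.c:321)
[ 46.565607] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 46.565607] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 46.565607] arp_solicit (net/ipv4/arp.c:392)"""

print(common_prefix(reliable_frames(ipv6), reliable_frames(ipv4)))
```

On these excerpts the shared frames are `__netif_rx`, `veth_xmit`, `dev_hard_start_xmit` and `__dev_queue_xmit`, which suggests the trap depends on the veth receive path rather than on the IP version.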
Because this is affecting our CI, I suggest reopening it for the moment.
I managed to reproduce it manually by:
--privileged mode → QEMU is then using tcg instead.
echo "run_loop_n 150 run_selftest_one mptcp_connect.sh" > .virtme-exec-run
Stopping the test after the ping using this patch:
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 7898d62fce0b..52320cb95d31 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -852,6 +852,7 @@ done
mptcp_lib_result_code "${ret}" "ping tests"
stop_if_error "Could not even run ping tests"
+exit ${final_ret}
[ -n "$tc_loss" ] && tc -net "$ns2" qdisc add dev ns2eth3 root netem loss random $tc_loss delay ${tc_delay}ms
echo -n "INFO: Using loss of $tc_loss "
Sometimes, it is "quick" (~10 attempts), but sometimes it takes more than 100 attempts.
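Because the number of attempts varies so much, it helps to record on which attempt the first failure happens. The sketch below is a generic stand-in, not the `run_loop_n` helper from the CI scripts: it assumes the reproducer is a command that exits with a non-zero status when the VM stops unexpectedly. The name `first_failure` is made up for this example.

```python
import subprocess
import sys


def first_failure(cmd, max_attempts):
    """Run cmd up to max_attempts times; return the 1-based attempt that
    failed (non-zero exit status), or None if every attempt passed."""
    for attempt in range(1, max_attempts + 1):
        if subprocess.run(cmd).returncode != 0:
            return attempt
    return None


# Stand-in commands; a real run would launch the VM with the selftest instead.
always_pass = [sys.executable, "-c", "raise SystemExit(0)"]
always_fail = [sys.executable, "-c", "raise SystemExit(1)"]

print(first_failure(always_pass, 3))  # None
print(first_failure(always_fail, 3))  # 1
```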
I started to do a Git bisect, but I can still reproduce it on a v6.4 kernel for example.
The Cirrus CI (KVM) never complained about this, so maybe it is an issue with TCG, which is used here instead of KVM? Or maybe an issue with QEMU itself? I tried to upgrade QEMU to v8 (we are currently on v6.2), but virtme sets QEMU options that are no longer supported...
After a few long git bisect sessions, I managed to find a commit. If I revert this commit on top of our export branch, I can no longer reproduce the issue, or at least not within ~2000 iterations. Most of the time, I hit the panic in fewer than 50 iterations. A few times it took more than 100 iterations, up to 140. During the bisect sessions, I ended up running 200 iterations before marking a commit as good. So I guess 2000 iterations are enough to confirm that this commit does something.
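To put rough numbers on "enough iterations", here is a back-of-the-envelope sketch. It assumes that iterations are independent and that each one panics with a fixed probability. The value 1/50 is only an assumption loosely based on the "fewer than 50 iterations most of the time" observation; it is not a measured rate.

```python
def prob_all_pass(p_fail, iterations):
    """Probability that `iterations` independent runs all pass when each
    run fails with probability p_fail."""
    return (1.0 - p_fail) ** iterations


P_FAIL = 1 / 50  # assumption, not a measured value

for n in (200, 2000):
    print(n, prob_all_pass(P_FAIL, n))
```

Under this assumption, 200 clean iterations on a kernel that still has the bug would happen in roughly 2% of cases, while 2000 clean iterations would be vanishingly unlikely. A smaller per-iteration failure rate makes the 200-iteration threshold noticeably weaker, so it is worth trying other values for `P_FAIL`.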
Now... surprisingly, this commit is 8e791f7eba4c ("x86/kprobes: Drop removed INT3 handling code"): a modification in arch/x86/kernel/kprobes/core.c. I'm not sure I see the link.
I guess the best is to report this to the author of the patch.
# bad: [457391b0380335d5e9a5babdec90ac53928b23b4] Linux 6.3
# good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
git bisect start 'v6.3' 'v6.2'
# bad: [056612fd41fef88eef22a032021cc15ef98cfc34] Merge tag 'x86-cleanups-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 056612fd41fef88eef22a032021cc15ef98cfc34
# bad: [3f0b0903fde584a7398f82fc00bf4f8138610b87] Merge tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 3f0b0903fde584a7398f82fc00bf4f8138610b87
# good: [7dbdc16fc85bcd89a2f3698df37a7202ea266454] Merge tag 'qcom-arm64-for-6.3-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/dt
git bisect good 7dbdc16fc85bcd89a2f3698df37a7202ea266454
# good: [5b0ed5964928b0aaf0d644c17c886c7f5ea4bb3f] Merge tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux
git bisect good 5b0ed5964928b0aaf0d644c17c886c7f5ea4bb3f
# good: [6e649d08568220ee88deef0a1ad8b3a935420cf2] Merge tag 'locking-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 6e649d08568220ee88deef0a1ad8b3a935420cf2
# good: [7c4a5b89a0b5a57a64b601775b296abf77a9fe97] sched/rt: pick_next_rt_entity(): check list_entry
git bisect good 7c4a5b89a0b5a57a64b601775b296abf77a9fe97
# bad: [0246725d7399d7d6acc8fd5a1a0a1ffce9a1eaa3] Merge tag 'ras_core_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 0246725d7399d7d6acc8fd5a1a0a1ffce9a1eaa3
# bad: [fd636b6a9bc6034f2e5bb869658898a2b472c037] x86/perf/zhaoxin: Add stepping check for ZXC
git bisect bad fd636b6a9bc6034f2e5bb869658898a2b472c037
# bad: [4cf7a136115e96241f9f1089d2b53c47accf3823] perf/core: Save the dynamic parts of sample data size
git bisect bad 4cf7a136115e96241f9f1089d2b53c47accf3823
# bad: [a018d2e3d4b1abc4a3cb64415c5d204fc5d2eafd] x86/cpufeatures: Add Architectural PerfMon Extension bit
git bisect bad a018d2e3d4b1abc4a3cb64415c5d204fc5d2eafd
# bad: [b6c00fb9949fbd073e651a77aa75faca978cf2a6] perf: Add PMU_FORMAT_ATTR_SHOW
git bisect bad b6c00fb9949fbd073e651a77aa75faca978cf2a6
# bad: [8e791f7eba4c7711f56616ae163ee3cbc00b1bf4] x86/kprobes: Drop removed INT3 handling code
git bisect bad 8e791f7eba4c7711f56616ae163ee3cbc00b1bf4
# good: [03c4c7f88709fac0e20b6a48357c73d6fc50e544] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers
git bisect good 03c4c7f88709fac0e20b6a48357c73d6fc50e544
# first bad commit: [8e791f7eba4c7711f56616ae163ee3cbc00b1bf4] x86/kprobes: Drop removed INT3 handling code
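A bisect like the one logged above can be automated with `git bisect run`, which treats exit status 0 as good, 125 as "skip this commit", and other statuses from 1 to 127 as bad. Since the panic is intermittent, a commit should only be reported as good after many clean iterations. The sketch below shows one way to structure such a script. The callables `build` and `run_once` are hypothetical placeholders for "build and boot this kernel" and "run the ping test once", and the default of 200 iterations mirrors the threshold mentioned earlier in the thread.

```python
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit statuses understood by `git bisect run`


def bisect_verdict(build, run_once, iterations=200):
    """Return the exit status to report for the currently checked-out commit.

    build:    callable returning True if the kernel could be built
    run_once: callable returning True if one test iteration passed
    """
    if not build():
        return SKIP  # cannot test this commit, let git pick another one
    for _ in range(iterations):
        if not run_once():
            return BAD  # a single panic is enough to mark the commit bad
    return GOOD  # only after `iterations` clean runs


if __name__ == "__main__":
    # Placeholder callables so that the sketch runs on its own.
    status = bisect_verdict(build=lambda: True, run_once=lambda: True,
                            iterations=5)
    print("exit status:", status)
    # A real script would end with: sys.exit(status)
```

The asymmetry is deliberate: a "bad" verdict is certain after one failure, while a "good" verdict is only as trustworthy as the number of iterations behind it.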
Note that I just managed to reproduce it on top of the export branch (export/20240119T055335), after having done a ping in IPv4 this time:
# INFO: set ns4-65abe8a3-O2ZCgd dev ns4eth3: ethtool -K tso off
# Created /tmp/tmp.BWY7Jw45jg (size 1924224 /tmp/tmp.BWY7Jw45jg) containing data sent by client
# Created /tmp/tmp.19cAx2Eg8O (size 2428289 /tmp/tmp.19cAx2Eg8O) containing data sent by server
# New MPTCP socket can be blocked via sysctl [ OK ]
# INFO: validating network environment with pings
[ 1985.073189] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 1985.073246] CPU: 0 PID: 3203 Comm: ping Not tainted 6.7.0-113761-g5e006770879c-dirty #250
[ 1985.073246] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1985.073246] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 67 48 d0
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: 67 addr32
3e: 48 rex.W
3f: d0 .byte 0xd0
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: 67 addr32
14: 48 rex.W
15: d0 .byte 0xd0
[ 1985.073246] RSP: 0018:ffffb36d40003c08 EFLAGS: 00000246
[ 1985.073246] RAX: 0000000000000000 RBX: ffff9580825ca000 RCX: 0000000000000001
[ 1985.073246] RDX: 0000000000000002 RSI: ffff9580825c8000 RDI: ffff9580821cca00
[ 1985.073246] RBP: ffff9580821cca00 R08: 0000000000000000 R09: 000000000000001c
[ 1985.073246] R10: ffff9580812dcf10 R11: ffff9580812dcf00 R12: ffff9580821cca00
[ 1985.073246] R13: 000000000000002a R14: 0000000000000000 R15: ffff9580825e1800
[ 1985.073246] FS: 00007fa7c46be1c0(0000) GS:ffff9580fdc00000(0000) knlGS:0000000000000000
[ 1985.073246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1985.073246] CR2: 00005584236b2200 CR3: 0000000002704000 CR4: 00000000000006f0
[ 1985.073246] Call Trace:
[ 1985.073246] <IRQ>
[ 1985.073246] ? die (arch/x86/kernel/dumpstack.c:421)
[ 1985.073246] ? exc_int3 (arch/x86/kernel/traps.c:762)
[ 1985.073246] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
[ 1985.073246] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] ? kmem_cache_alloc_node (mm/slub.c:3843)
[ 1985.073246] __netif_rx (net/core/dev.c:5084)
[ 1985.073246] veth_xmit (drivers/net/veth.c:321)
[ 1985.073246] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 1985.073246] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 1985.073246] ? arp_create (net/ipv4/arp.c:577)
[ 1985.073246] arp_solicit (net/ipv4/arp.c:392)
[ 1985.073246] ? kmem_cache_alloc (mm/slub.c:3843)
[ 1985.073246] ? arp_constructor (net/ipv4/arp.c:249)
[ 1985.073246] neigh_probe (arch/x86/include/asm/atomic.h:53)
[ 1985.073246] __neigh_event_send (net/core/neighbour.c:1242)
[ 1985.073246] neigh_resolve_output (net/core/neighbour.c:1547)
[ 1985.073246] ip_finish_output2 (include/net/neighbour.h:542)
[ 1985.073246] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884)
[ 1985.073246] __netif_receive_skb_one_core (net/core/dev.c:5537)
[ 1985.073246] process_backlog (include/linux/rcupdate.h:782)
[ 1985.073246] __napi_poll (net/core/dev.c:6576)
[ 1985.073246] net_rx_action (net/core/dev.c:6647)
[ 1985.073246] __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] do_softirq (kernel/softirq.c:454)
[ 1985.073246] </IRQ>
[ 1985.073246] <TASK>
[ 1985.073246] __local_bh_enable_ip (kernel/softirq.c:381)
[ 1985.073246] __dev_queue_xmit (net/core/dev.c:4379)
[ 1985.073246] ip_finish_output2 (include/linux/netdevice.h:3171)
[ 1985.073246] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884)
[ 1985.073246] ip_push_pending_frames (net/ipv4/ip_output.c:1490)
[ 1985.073246] raw_sendmsg (net/ipv4/raw.c:647)
[ 1985.073246] ? netfs_rreq_assess (fs/netfs/io.c:101)
[ 1985.073246] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206)
[ 1985.073246] ? set_pte_range (mm/memory.c:4529)
[ 1985.073246] ? __sock_sendmsg (net/socket.c:733)
[ 1985.073246] __sock_sendmsg (net/socket.c:733)
[ 1985.073246] ? move_addr_to_kernel.part.0 (net/socket.c:253)
[ 1985.073246] __sys_sendto (net/socket.c:2191)
[ 1985.073246] ? __rseq_handle_notify_resume (kernel/rseq.c:257)
[ 1985.073246] ? ktime_get_real_ts64 (kernel/time/timekeeping.c:292 (discriminator 3))
[ 1985.073246] __x64_sys_sendto (net/socket.c:2203)
[ 1985.073246] do_syscall_64 (arch/x86/entry/common.c:52)
[ 1985.073246] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 1985.073246] RIP: 0033:0x7fa7c499081a
[ 1985.073246] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
0: d8 64 89 02 fsubs 0x2(%rcx,%rcx,4)
4: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
b: eb b8 jmp 0xffffffffffffffc5
d: 0f 1f 00 nopl (%rax)
10: f3 0f 1e fa endbr64
14: 41 89 ca mov %ecx,%r10d
17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
1e: 00
1f: 85 c0 test %eax,%eax
21: 75 15 jne 0x38
23: b8 2c 00 00 00 mov $0x2c,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 7e ja 0xb0
32: c3 ret
33: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
38: 41 54 push %r12
3a: 48 83 ec 30 sub $0x30,%rsp
3e: 44 rex.R
3f: 89 .byte 0x89
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 7e ja 0x86
8: c3 ret
9: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
e: 41 54 push %r12
10: 48 83 ec 30 sub $0x30,%rsp
14: 44 rex.R
15: 89 .byte 0x89
[ 1985.073246] RSP: 002b:00007ffce269b368 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 1985.073246] RAX: ffffffffffffffda RBX: 00007ffce269ca10 RCX: 00007fa7c499081a
[ 1985.073246] RDX: 0000000000000040 RSI: 0000558423f7c300 RDI: 0000000000000003
[ 1985.073246] RBP: 0000558423f7c300 R08: 00007ffce269ec90 R09: 0000000000000010
[ 1985.073246] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[ 1985.073246] R13: 00007ffce269c5a8 R14: 00007ffce269b370 R15: 00007ffce269ca10
[ 1985.073246] </TASK>
[ 1985.073246] Modules linked in:
[ 1985.073246] ---[ end trace 0000000000000000 ]---
[ 1985.073246] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 67 48 d0
All code
========
0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
7: 00
8: 0f 1f 40 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 55 push %rbp
12: 48 89 fd mov %rdi,%rbp
15: 48 83 ec 20 sub $0x20,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 18 mov %rax,0x18(%rsp)
27: 31 c0 xor %eax,%eax
29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
2e: 66 90 xchg %ax,%ax
30: 66 90 xchg %ax,%ax
32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
37: 48 89 ef mov %rbp,%rdi
3a: 65 gs
3b: 8b .byte 0x8b
3c: 35 .byte 0x35
3d: 67 addr32
3e: 48 rex.W
3f: d0 .byte 0xd0
Code starting with the faulting instruction
===========================================
0: c9 leave
1: 00 00 add %al,(%rax)
3: 00 66 90 add %ah,-0x70(%rsi)
6: 66 90 xchg %ax,%ax
8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
d: 48 89 ef mov %rbp,%rdi
10: 65 gs
11: 8b .byte 0x8b
12: 35 .byte 0x35
13: 67 addr32
14: 48 rex.W
15: d0 .byte 0xd0
[ 1985.073246] RSP: 0018:ffffb36d40003c08 EFLAGS: 00000246
[ 1985.073246] RAX: 0000000000000000 RBX: ffff9580825ca000 RCX: 0000000000000001
[ 1985.073246] RDX: 0000000000000002 RSI: ffff9580825c8000 RDI: ffff9580821cca00
[ 1985.073246] RBP: ffff9580821cca00 R08: 0000000000000000 R09: 000000000000001c
[ 1985.073246] R10: ffff9580812dcf10 R11: ffff9580812dcf00 R12: ffff9580821cca00
[ 1985.073246] R13: 000000000000002a R14: 0000000000000000 R15: ffff9580825e1800
[ 1985.073246] FS: 00007fa7c46be1c0(0000) GS:ffff9580fdc00000(0000) knlGS:0000000000000000
[ 1985.073246] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1985.073246] CR2: 00005584236b2200 CR3: 0000000002704000 CR4: 00000000000006f0
[ 1985.073246] Kernel panic - not syncing: Fatal exception in interrupt
[ 1985.073246] Kernel Offset: 0x19a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM
What was being done in userspace:
++ dirname ./mptcp_connect.sh
+ . ./mptcp_lib.sh
++ readonly KSFT_PASS=0
++ KSFT_PASS=0
++ readonly KSFT_FAIL=1
++ KSFT_FAIL=1
++ readonly KSFT_SKIP=4
++ KSFT_SKIP=4
+++ basename ./mptcp_connect.sh
+++ sed 's/\.sh$//g'
++ readonly KSFT_TEST=mptcp_connect
++ KSFT_TEST=mptcp_connect
++ MPTCP_LIB_SUBTESTS=()
++ '[' -t 1 ']'
++ '[' 1 = 1 ']'
++ '[' '' '!=' 1 ']'
++ readonly 'MPTCP_LIB_COLOR_RED=\E[1;31m'
++ MPTCP_LIB_COLOR_RED='\E[1;31m'
++ readonly 'MPTCP_LIB_COLOR_GREEN=\E[1;32m'
++ MPTCP_LIB_COLOR_GREEN='\E[1;32m'
++ readonly 'MPTCP_LIB_COLOR_YELLOW=\E[1;33m'
++ MPTCP_LIB_COLOR_YELLOW='\E[1;33m'
++ readonly 'MPTCP_LIB_COLOR_BLUE=\E[1;34m'
++ MPTCP_LIB_COLOR_BLUE='\E[1;34m'
++ readonly 'MPTCP_LIB_COLOR_RESET=\E[0m'
++ MPTCP_LIB_COLOR_RESET='\E[0m'
++ date +%s
+ time_start=1705765027
+ optstring=S:R:d:e:l:r:h4cm:f:tC
+ ret=0
+ final_ret=0
+ sin=
+ sout=
+ cin_disconnect=
+ cin=
+ cout=
+ ksft_skip=4
+ capture=false
+ timeout_poll=30
+ timeout_test=61
+ ipv6=true
+ ethtool_random_on=true
+ tc_delay=49
+ tc_loss=76
+ testmode=
+ sndbuf=0
+ rcvbuf=0
+ options_log=true
+ do_tcp=0
+ checksum=false
+ filesize=0
+ connect_per_transfer=1
+ '[' 76 -eq 100 ']'
+ '[' 76 -ge 10 ']'
+ tc_loss=0.76%
+ getopts S:R:d:e:l:r:h4cm:f:tC option
++ date +%s
+ sec=1705765027
++ printf %x 1705765027
++ mktemp -u XXXXXX
+ rndh=65abe8a3-O2ZCgd
+ ns1=ns1-65abe8a3-O2ZCgd
+ ns2=ns2-65abe8a3-O2ZCgd
+ ns3=ns3-65abe8a3-O2ZCgd
+ ns4=ns4-65abe8a3-O2ZCgd
+ TEST_COUNT=0
+ TEST_GROUP=
+ mptcp_lib_check_mptcp
+ mptcp_lib_has_file /proc/sys/net/mptcp/enabled
+ local f=/proc/sys/net/mptcp/enabled
+ '[' -f /proc/sys/net/mptcp/enabled ']'
+ return 0
+ mptcp_lib_check_kallsyms
+ mptcp_lib_has_file /proc/kallsyms
+ local f=/proc/kallsyms
+ '[' -f /proc/kallsyms ']'
+ return 0
+ ip -Version
+ '[' 0 -ne 0 ']'
++ mktemp
+ sin=/tmp/tmp.19cAx2Eg8O
++ mktemp
+ sout=/tmp/tmp.D74uSh5z4f
++ mktemp
+ cin=/tmp/tmp.BWY7Jw45jg
++ mktemp
+ cout=/tmp/tmp.YGp9ybpEow
++ mktemp
+ capout=/tmp/tmp.FIKQQYHbaC
+ cin_disconnect=/tmp/tmp.BWY7Jw45jg.disconnect
+ cout_disconnect=/tmp/tmp.YGp9ybpEow.disconnect
+ trap cleanup EXIT
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns1-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd link set lo up
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns2-65abe8a3-O2ZCgd
+ ip -net ns2-65abe8a3-O2ZCgd link set lo up
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns3-65abe8a3-O2ZCgd
+ ip -net ns3-65abe8a3-O2ZCgd link set lo up
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns4-65abe8a3-O2ZCgd
+ ip -net ns4-65abe8a3-O2ZCgd link set lo up
+ ip link add ns1eth2 netns ns1-65abe8a3-O2ZCgd type veth peer name ns2eth1 netns ns2-65abe8a3-O2ZCgd
+ ip link add ns2eth3 netns ns2-65abe8a3-O2ZCgd type veth peer name ns3eth2 netns ns3-65abe8a3-O2ZCgd
+ ip link add ns3eth4 netns ns3-65abe8a3-O2ZCgd type veth peer name ns4eth3 netns ns4-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd addr add 10.0.1.1/24 dev ns1eth2
+ ip -net ns1-65abe8a3-O2ZCgd addr add dead:beef:1::1/64 dev ns1eth2 nodad
+ ip -net ns1-65abe8a3-O2ZCgd link set ns1eth2 up
+ ip -net ns1-65abe8a3-O2ZCgd route add default via 10.0.1.2
+ ip -net ns1-65abe8a3-O2ZCgd route add default via dead:beef:1::2
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.1.2/24 dev ns2eth1
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:1::2/64 dev ns2eth1 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth1 up
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.2.1/24 dev ns2eth3
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:2::1/64 dev ns2eth3 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth3 up
+ ip -net ns2-65abe8a3-O2ZCgd route add default via 10.0.2.2
+ ip -net ns2-65abe8a3-O2ZCgd route add default via dead:beef:2::2
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.2.2/24 dev ns3eth2
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:2::2/64 dev ns3eth2 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth2 up
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.3.2/24 dev ns3eth4
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:3::2/64 dev ns3eth4 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth4 up
+ ip -net ns3-65abe8a3-O2ZCgd route add default via 10.0.2.1
+ ip -net ns3-65abe8a3-O2ZCgd route add default via dead:beef:2::1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns4-65abe8a3-O2ZCgd addr add 10.0.3.1/24 dev ns4eth3
+ ip -net ns4-65abe8a3-O2ZCgd addr add dead:beef:3::1/64 dev ns4eth3 nodad
+ ip -net ns4-65abe8a3-O2ZCgd link set ns4eth3 up
+ ip -net ns4-65abe8a3-O2ZCgd route add default via 10.0.3.2
+ ip -net ns4-65abe8a3-O2ZCgd route add default via dead:beef:3::2
+ false
+ true
+ set_random_ethtool_flags ns3-65abe8a3-O2ZCgd ns3eth2
+ local flags=
+ local r=25968
+ local pick1=0
+ local pick2=0
+ local pick3=0
+ '[' 0 -ne 0 ']'
+ '[' 0 -ne 0 ']'
+ '[' 0 -ne 0 ']'
+ '[' -z '' ']'
+ return
+ set_random_ethtool_flags ns4-65abe8a3-O2ZCgd ns4eth3
+ local flags=
+ local r=32681
+ local pick1=1
+ local pick2=0
+ local pick3=0
+ '[' 1 -ne 0 ']'
+ flags='tso off'
+ '[' 0 -ne 0 ']'
+ '[' 0 -ne 0 ']'
+ '[' -z 'tso off' ']'
+ set_ethtool_flags ns4-65abe8a3-O2ZCgd ns4eth3 'tso off'
+ local ns=ns4-65abe8a3-O2ZCgd
+ local dev=ns4eth3
+ local 'flags=tso off'
+ ip netns exec ns4-65abe8a3-O2ZCgd ethtool -K ns4eth3 tso off
+ '[' 0 -eq 0 ']'
+ echo 'INFO: set ns4-65abe8a3-O2ZCgd dev ns4eth3: ethtool -K tso off'
+ make_file /tmp/tmp.BWY7Jw45jg client
+ local name=/tmp/tmp.BWY7Jw45jg
+ local who=client
+ local SIZE=0
+ local ksize
+ local rem
+ '[' 0 -eq 0 ']'
+ local MAXSIZE=8388608
+ local MINSIZE=262144
+ SIZE=1924196
+ ksize=1879
+ rem=100
+ mptcp_lib_make_file /tmp/tmp.BWY7Jw45jg 1024 1879
+ local name=/tmp/tmp.BWY7Jw45jg
+ local bs=1024
+ local size=1879
+ dd if=/dev/urandom of=/tmp/tmp.BWY7Jw45jg bs=1024 count=1879
+ echo -e '\nMPTCP_TEST_FILE_END_MARKER'
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.BWY7Jw45jg oflag=append bs=1 count=100
++ du -b /tmp/tmp.BWY7Jw45jg
+ echo 'Created /tmp/tmp.BWY7Jw45jg (size 1924224 /tmp/tmp.BWY7Jw45jg) containing data sent by client'
+ make_file /tmp/tmp.19cAx2Eg8O server
+ local name=/tmp/tmp.19cAx2Eg8O
+ local who=server
+ local SIZE=0
+ local ksize
+ local rem
+ '[' 0 -eq 0 ']'
+ local MAXSIZE=8388608
+ local MINSIZE=262144
+ SIZE=2428261
+ ksize=2371
+ rem=357
+ mptcp_lib_make_file /tmp/tmp.19cAx2Eg8O 1024 2371
+ local name=/tmp/tmp.19cAx2Eg8O
+ local bs=1024
+ local size=2371
+ dd if=/dev/urandom of=/tmp/tmp.19cAx2Eg8O bs=1024 count=2371
+ echo -e '\nMPTCP_TEST_FILE_END_MARKER'
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.19cAx2Eg8O oflag=append bs=1 count=357
++ du -b /tmp/tmp.19cAx2Eg8O
+ echo 'Created /tmp/tmp.19cAx2Eg8O (size 2428289 /tmp/tmp.19cAx2Eg8O) containing data sent by server'
+ check_mptcp_disabled
+ local disabled_ns=ns_disabled-65abe8a3-O2ZCgd
+ ip netns add ns_disabled-65abe8a3-O2ZCgd
++ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl net.mptcp.enabled
++ awk '{ print $3 }'
+ '[' 1 -ne 1 ']'
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl -q net.mptcp.enabled=0
+ local err=0
+ grep -q '^socket: Protocol not available$'
+ LC_ALL=C
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd ./mptcp_connect -p 10000 -s MPTCP 127.0.0.1
+ err=1
+ ip netns delete ns_disabled-65abe8a3-O2ZCgd
+ '[' 1 -eq 0 ']'
+ echo -e 'New MPTCP socket can be blocked via sysctl\t\t[ OK ]'
+ mptcp_lib_result_pass 'New MPTCP socket can be blocked via sysctl'
+ __mptcp_lib_result_add ok 'New MPTCP socket can be blocked via sysctl'
+ local result=ok
+ shift
+ local id=1
+ MPTCP_LIB_SUBTESTS+=("${result} ${id} - ${KSFT_TEST}: ${*}")
+ return 0
+ stop_if_error 'The kernel configuration is not valid for MPTCP'
+ log_if_error 'The kernel configuration is not valid for MPTCP'
+ local 'msg=The kernel configuration is not valid for MPTCP'
+ '[' 0 -ne 0 ']'
+ echo 'INFO: validating network environment with pings'
+ for sender in "$ns1" "$ns2" "$ns3" "$ns4"
+ do_ping ns1-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.1.1
+ local listener_ns=ns1-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.1.1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.1.1
+ '[' -z 10.0.1.1 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns1-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:1::1
+ local listener_ns=ns1-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:1::1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:1::1
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.1.2
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.1.2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.1.2
+ '[' -z 10.0.1.2 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:1::2
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:1::2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:1::2
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.2.1
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.2.1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.2.1
+ '[' -z 10.0.2.1 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:2::1
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:2::1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:2::1
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.2.2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.2.2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.2.2
+ '[' -z 10.0.2.2 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:2::2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:2::2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:2::2
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.3.2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.3.2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.3.2
+ '[' -z 10.0.3.2 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:3::2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:3::2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:3::2
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:3::2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns4-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.3.1
+ local listener_ns=ns4-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.3.1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.3.1
+ '[' -z 10.0.3.1 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.1
Here is the stripped version to ease the reading:
+ ip netns add ns1-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd link set lo up
+ ip netns add ns2-65abe8a3-O2ZCgd
+ ip -net ns2-65abe8a3-O2ZCgd link set lo up
+ ip netns add ns3-65abe8a3-O2ZCgd
+ ip -net ns3-65abe8a3-O2ZCgd link set lo up
+ ip netns add ns4-65abe8a3-O2ZCgd
+ ip -net ns4-65abe8a3-O2ZCgd link set lo up
+ ip link add ns1eth2 netns ns1-65abe8a3-O2ZCgd type veth peer name ns2eth1 netns ns2-65abe8a3-O2ZCgd
+ ip link add ns2eth3 netns ns2-65abe8a3-O2ZCgd type veth peer name ns3eth2 netns ns3-65abe8a3-O2ZCgd
+ ip link add ns3eth4 netns ns3-65abe8a3-O2ZCgd type veth peer name ns4eth3 netns ns4-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd addr add 10.0.1.1/24 dev ns1eth2
+ ip -net ns1-65abe8a3-O2ZCgd addr add dead:beef:1::1/64 dev ns1eth2 nodad
+ ip -net ns1-65abe8a3-O2ZCgd link set ns1eth2 up
+ ip -net ns1-65abe8a3-O2ZCgd route add default via 10.0.1.2
+ ip -net ns1-65abe8a3-O2ZCgd route add default via dead:beef:1::2
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.1.2/24 dev ns2eth1
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:1::2/64 dev ns2eth1 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth1 up
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.2.1/24 dev ns2eth3
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:2::1/64 dev ns2eth3 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth3 up
+ ip -net ns2-65abe8a3-O2ZCgd route add default via 10.0.2.2
+ ip -net ns2-65abe8a3-O2ZCgd route add default via dead:beef:2::2
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.2.2/24 dev ns3eth2
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:2::2/64 dev ns3eth2 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth2 up
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.3.2/24 dev ns3eth4
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:3::2/64 dev ns3eth4 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth4 up
+ ip -net ns3-65abe8a3-O2ZCgd route add default via 10.0.2.1
+ ip -net ns3-65abe8a3-O2ZCgd route add default via dead:beef:2::1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns4-65abe8a3-O2ZCgd addr add 10.0.3.1/24 dev ns4eth3
+ ip -net ns4-65abe8a3-O2ZCgd addr add dead:beef:3::1/64 dev ns4eth3 nodad
+ ip -net ns4-65abe8a3-O2ZCgd link set ns4eth3 up
+ ip -net ns4-65abe8a3-O2ZCgd route add default via 10.0.3.2
+ ip -net ns4-65abe8a3-O2ZCgd route add default via dead:beef:3::2
+ ip netns exec ns4-65abe8a3-O2ZCgd ethtool -K ns4eth3 tso off
+ dd if=/dev/urandom of=/tmp/tmp.BWY7Jw45jg bs=1024 count=1879
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.BWY7Jw45jg oflag=append bs=1 count=100
+ dd if=/dev/urandom of=/tmp/tmp.19cAx2Eg8O bs=1024 count=2371
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.19cAx2Eg8O oflag=append bs=1 count=357
+ ip netns add ns_disabled-65abe8a3-O2ZCgd
++ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl net.mptcp.enabled | awk '{ print $3 }'
+ '[' 1 -ne 1 ']'
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl -q net.mptcp.enabled=0
+ local err=0
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd ./mptcp_connect -p 10000 -s MPTCP 127.0.0.1 | grep -q '^socket: Protocol not available$'
+ err=1
+ ip netns delete ns_disabled-65abe8a3-O2ZCgd
+ '[' 1 -eq 0 ']'
New MPTCP socket can be blocked via sysctl [ OK ]
INFO: validating network environment with pings
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:3::2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.1
<crash>
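For reference, a stripped view like the one above can be approximated with a small filter over the `set -x` output. This is only a minimal sketch: the `strip_xtrace` helper and the `KEEP` list of command names are assumptions made for illustration, not part of the selftests, and the list above was trimmed by hand, so the result will not match it line for line.

```python
import re

# Assumption: external commands considered worth keeping; adjust as needed.
KEEP = {"ip", "dd", "ethtool", "ping", "sysctl"}


def strip_xtrace(lines, keep=KEEP):
    """Return only the xtrace lines whose command word is in `keep`."""
    kept = []
    for line in lines:
        # xtrace lines start with one or more '+' followed by a space.
        m = re.match(r"^(\++) (\S+)", line)
        if m and m.group(2) in keep:
            kept.append(line.rstrip("\n"))
    return kept


# A few lines copied from the full trace in this comment.
sample = [
    "+ for i in \"$ns1\" \"$ns2\" \"$ns3\" \"$ns4\"",
    "+ ip netns add ns1-65abe8a3-O2ZCgd",
    "+ local listener_ns=ns1-65abe8a3-O2ZCgd",
    "+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.1",
    "+ '[' 0 -ne 0 ']'",
]

for line in strip_xtrace(sample):
    print(line)
```

Filtering on the command word keeps the network setup and the pings while dropping the `local`, `for` and test lines that make the full trace hard to read.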
Kernel config: config.gz
Short summary of the discussion we had on lore:
"Intel SDM Vol3A, 9.1.3 Handling Self- and Cross-Modifying Code" says that, for cross-modifying code, what the other CPU needs to do is "Execute serializing instruction; ( For example, CPUID instruction )". That is done in do_sync_core(), so this bug should not happen.
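To make the byte patterns in the traces easier to read, here is a minimal sketch of the INT3-based patching order, as I understand it from the kernel's x86 text poking code (text_poke_bp() in arch/x86/kernel/alternative.c). It only models the bytes in memory, not the CPU synchronisation done after each step. The `poke_states` helper is hypothetical, and the old bytes are an assumption (a 5-byte NOP at the jump label site); the new bytes are the `e9 c9 00 00 00` jmp seen at the trapping address in the first two traces.

```python
INT3 = 0xCC


def poke_states(old: bytes, new: bytes):
    """Yield the byte patterns memory goes through while `old` becomes `new`."""
    assert len(old) == len(new)
    code = bytearray(old)
    # Step 1: put a breakpoint on the first byte (the kernel then syncs all CPUs).
    code[0] = INT3
    yield bytes(code)
    # Step 2: write everything except the first byte (followed by another sync).
    code[1:] = new[1:]
    yield bytes(code)
    # Step 3: replace the breakpoint with the new first byte (final sync).
    code[0] = new[0]
    yield bytes(code)


# Assumption: the site held a 5-byte NOP before the static key was enabled.
old = bytes.fromhex("0f1f440000")
# The 5-byte jmp reported at the trapping address in the first two traces.
new = bytes.fromhex("e9c9000000")

states = [s.hex(" ") for s in poke_states(old, new)]
for s in states:
    print(s)
```

The second state has the same shape as the `cc dc 00 00 00` bytes in the 6.8.0-rc3 trace of this issue: an int3 followed by the tail of a jmp. In the first two traces the dump already shows the final `e9`, which suggests the patching had completed by the time the oops was printed.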
Patch: "x86: Fixup from the removed INT3 if it is unhandled"
Next steps are:
Reopening: without the workaround (kernel patch), we still hit the issue with QEMU 8.0.4, the version installed in the Docker image used by the CI to execute the tests.
Next steps: try to identify the fix on the QEMU side and have it backported (or upgrade QEMU manually?)
Note that we just hit the issue again, even with the kernel patch applied as a workaround:
+ ./mptcp_connect.sh -m mmap
+ tee /github/workspace/mptcp_connect_mmap.tap.tmp
+ /github/workspace/tools/testing/selftests/kselftest/prefix.pl
# INFO: set ns4-65ce3396-YqN01V dev ns4eth3: ethtool -K gso off gro off
# Created /tmp/tmp.A4VMEbiPE9 (size 5537555 /tmp/tmp.A4VMEbiPE9) containing data sent by client
# Created /tmp/tmp.nXZrvFIw8k (size 4784112 /tmp/tmp.nXZrvFIw8k) containing data sent by server
# New MPTCP socket can be blocked via sysctl [ OK ]
# INFO: validating network environment with pings
[ 1620.690258] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 1620.690586] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G N 6.8.0-rc3-g39cb90ad6cf5 #1
[ 1620.690586] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 1620.690586] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1620.690586] Code: e9 fd fe ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 48 89 fb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 cc <dc> 00 00 00 0f 1f 44 00 00 66 90 c7 44 24 08 00 00 00 00 48 89 df
All code
========
0: e9 fd fe ff ff jmp 0xffffffffffffff02
5: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 53 push %rbx
12: 48 89 fb mov %rdi,%rbx
15: 48 83 ec 18 sub $0x18,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 10 mov %rax,0x10(%rsp)
27: 31 c0 xor %eax,%eax
29: cc int3
2a:* dc 00 faddl (%rax) <-- trapping instruction
2c: 00 00 add %al,(%rax)
2e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
33: 66 90 xchg %ax,%ax
35: c7 44 24 08 00 00 00 movl $0x0,0x8(%rsp)
3c: 00
3d: 48 89 df mov %rbx,%rdi
Code starting with the faulting instruction
===========================================
0: dc 00 faddl (%rax)
2: 00 00 add %al,(%rax)
4: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
9: 66 90 xchg %ax,%ax
b: c7 44 24 08 00 00 00 movl $0x0,0x8(%rsp)
12: 00
13: 48 89 df mov %rbx,%rdi
[ 1620.690586] RSP: 0018:ffffae3bc011cbf0 EFLAGS: 00000246
[ 1620.690586] RAX: 0000000000000000 RBX: ffff998843b31800 RCX: 0000000000000002
[ 1620.690586] RDX: 0000000000000002 RSI: ffff998842c45000 RDI: ffff998843b31800
[ 1620.690586] RBP: ffff998842c44000 R08: ffff998844149a00 R09: 0000000000000000
[ 1620.690586] R10: ffff998844538980 R11: ffff9988419a7238 R12: 0000000000000000
[ 1620.690586] R13: 0000000000000046 R14: 0000000000000000 R15: ffff9988446ba000
[ 1620.690586] FS: 0000000000000000(0000) GS:ffff9988bdd00000(0000) knlGS:0000000000000000
[ 1620.690586] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1620.690586] CR2: 000055bd409dfca0 CR3: 00000000025fa000 CR4: 00000000000006f0
[ 1620.690586] Call Trace:
[ 1620.690586] <IRQ>
[ 1620.690586] ? die (arch/x86/kernel/dumpstack.c:421)
[ 1620.690586] ? exc_int3 (arch/x86/kernel/traps.c:781)
[ 1620.690586] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
[ 1620.690586] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1620.690586] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1620.690586] __netif_rx (net/core/dev.c:5092)
[ 1620.690586] veth_xmit (drivers/net/veth.c:374 (discriminator 2))
[ 1620.690586] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 1620.690586] __dev_queue_xmit (include/linux/netdevice.h:3367 (discriminator 25))
[ 1620.690586] ip6_finish_output2 (include/linux/netdevice.h:3171)
[ 1620.690586] ? ip6_output (include/linux/netfilter.h:301 (discriminator 1))
[ 1620.690586] ? ip6_mtu (net/ipv6/route.c:3217)
[ 1620.690586] ndisc_send_skb (net/ipv6/ndisc.c:512)
[ 1620.690586] addrconf_rs_timer (net/ipv6/addrconf.c:4000)
[ 1620.690586] ? ipv6_get_lladdr (net/ipv6/addrconf.c:3976)
[ 1620.690586] call_timer_fn (arch/x86/include/asm/jump_label.h:27)
[ 1620.690586] ? ipv6_get_lladdr (net/ipv6/addrconf.c:3976)
[ 1620.690586] __run_timers (kernel/time/timer.c:1752)
[ 1620.690586] run_timer_softirq (kernel/time/timer.c:2053 (discriminator 1))
[ 1620.690586] __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 1620.690586] irq_exit_rcu (kernel/softirq.c:427)
[ 1620.690586] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 47))
[ 1620.690586] </IRQ>
[ 1620.690586] <TASK>
[ 1620.690586] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:649)
[ 1620.690586] RIP: 0010:default_idle (arch/x86/include/asm/irqflags.h:37)
[ 1620.690586] Code: 89 07 49 c7 c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 76 ff ff ff cc cc cc cc f3 0f 1e fa eb 07 0f 00 2d d3 54 37 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 65
All code
========
0: 89 07 mov %eax,(%rdi)
2: 49 c7 c0 08 00 00 00 mov $0x8,%r8
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 53 push %rbx
12: 48 89 fb mov %rdi,%rbx
15: 48 83 ec 18 sub $0x18,%rsp
19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
20: 00 00
22: 48 89 44 24 10 mov %rax,0x10(%rsp)
27: 31 c0 xor %eax,%eax
29: cc int3
2a:* dc 00 faddl (%rax) <-- trapping instruction
2c: 00 00 add %al,(%rax)
2e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
33: 66 90 xchg %ax,%ax
35: c7 44 24 08 00 00 00 movl $0x0,0x8(%rsp)
3c: 00
3d: 48 89 df mov %rbx,%rdi
Code starting with the faulting instruction
===========================================
0: dc 00 faddl (%rax)
2: 00 00 add %al,(%rax)
4: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
9: 66 90 xchg %ax,%ax
b: c7 44 24 08 00 00 00 movl $0x0,0x8(%rsp)
12: 00
13: 48 89 df mov %rbx,%rdi
[ 1620.690586] RSP: 0018:ffffae3bc011cbf0 EFLAGS: 00000246
[ 1620.690586] RAX: 0000000000000000 RBX: ffff998843b31800 RCX: 0000000000000002
[ 1620.690586] RDX: 0000000000000002 RSI: ffff998842c45000 RDI: ffff998843b31800
[ 1620.690586] RBP: ffff998842c44000 R08: ffff998844149a00 R09: 0000000000000000
[ 1620.690586] R10: ffff998844538980 R11: ffff9988419a7238 R12: 0000000000000000
[ 1620.690586] R13: 0000000000000046 R14: 0000000000000000 R15: ffff9988446ba000
[ 1620.690586] FS: 0000000000000000(0000) GS:ffff9988bdd00000(0000) knlGS:0000000000000000
[ 1620.690586] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1620.690586] CR2: 000055bd409dfca0 CR3: 00000000025fa000 CR4: 00000000000006f0
[ 1620.690586] Kernel panic - not syncing: Fatal exception in interrupt
[ 1620.690586] Kernel Offset: 0x1fe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7918253003/job/21616220869
Next steps: try to identify the fix on the QEMU side and have it backported (or upgrade QEMU manually?)
The issue has been fixed in QEMU v8.1.0, but the fix has not been backported to earlier versions, and it looks like there will not be any new v8.0 releases.
The fixes on QEMU's side:
There are some conflicts when backporting them to v8.0.4, but nothing blocking. I resolved the conflicts and pushed these 3 commits to this branch:
https://gitlab.com/matttbe/qemu/-/commits/lp-2051965/
Thanks to the Canonical Server devs, Ubuntu 23.10 (and maybe 22.04 too) will get a new version with the fixes. Once it is available, we can revert the kernel patch acting as a workaround and close this issue.
For more details: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/2051965
QEMU 8.0.4+dfsg-1ubuntu3.23.10.3 in Ubuntu 23.10 now includes a fix to avoid the kernel panic.
I then reverted the workaround from our tree.
Note that the workaround is also no longer needed since all our CIs now use KVM support (#474).
New patches for t/upstream-net and t/upstream:
Tests are now in progress:
A kernel panic has been detected by the CI (no debug kconfig).
Click to expand but probably ignore this one, no debug info
It looks like it is not related to MPTCP. Due to a global timeout, the trace has not been decoded and the vmlinux file has not been saved. Anyway, logging it here, just in case. I just relaunched the job, hoping to be able to reproduce it (no issues on my side).