multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev

syzkaller: soft lockup in `mptcp_token_exists()` #365

Closed: matttbe closed this issue 1 year ago

matttbe commented 1 year ago

With this syzkaller reproducer from #347:

```
# {Threaded:false Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:false NetInjection:false NetDevices:true NetReset:true Cgroups:true BinfmtMisc:false CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:true HandleSegv:false Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
r0 = socket$inet_mptcp(0x2, 0x1, 0x106)
bind$inet(r0, &(0x7f0000002200)={0x2, 0x4e20, @local}, 0x10)
listen(r0, 0x0)
r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
sendto$inet(r1, 0x0, 0x0, 0x2000c000, &(0x7f0000000000)={0x2, 0x4e20, @local}, 0x10)
```
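For readers less familiar with syzkaller syntax, the program above can be read as roughly the following C sketch. It is only an illustration, not the C reproducer generated by syzkaller: the names `run_repro_once`, `REPRO_PORT` and `REPRO_SEND_FLAGS` are made up here, the target address is left to the caller (using loopback instead of syzkaller's `@local` address is an assumption), and the `Repeat`, sandbox and `NetDevices` options from the header line are not reproduced. Whether this sketch alone triggers the lockup has not been verified.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values are the Linux UAPI ones. */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262               /* 0x106 in the reproducer */
#endif
#ifndef MSG_FASTOPEN
#define MSG_FASTOPEN 0x20000000
#endif

/* Hypothetical names, introduced for this sketch only. */
#define REPRO_PORT 20000                /* 0x4e20 in the reproducer */
#define REPRO_SEND_FLAGS (MSG_FASTOPEN | MSG_MORE | MSG_NOSIGNAL)

/* The sendto$inet() flags decode to exactly these three bits. */
static_assert(REPRO_SEND_FLAGS == 0x2000c000,
              "flags must match the reproducer");

/*
 * Issue the five reproducer syscalls once against the given IPv4 address.
 * Returns 0 once both sockets were created and the send was attempted,
 * -1 if the address is invalid or an MPTCP socket could not be set up.
 */
int run_repro_once(const char *ip)
{
    struct sockaddr_in addr;
    int lfd, cfd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(REPRO_PORT);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    /* r0: MPTCP listener with a backlog of 0 */
    lfd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
    if (lfd < 0) {
        perror("socket (listener)");
        return -1;
    }
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 0) < 0) {
        perror("bind/listen");
        close(lfd);
        return -1;
    }

    /* r1: second MPTCP socket, never explicitly connected */
    cfd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
    if (cfd < 0) {
        perror("socket (client)");
        close(lfd);
        return -1;
    }

    /* Zero-length send with MSG_FASTOPEN towards the listener. */
    if (sendto(cfd, NULL, 0, REPRO_SEND_FLAGS,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        perror("sendto");               /* an error is tolerated here */

    close(cfd);
    close(lfd);
    return 0;
}
```

Calling `run_repro_once("127.0.0.1")` in a loop from a small `main()` would approximate the `Repeat:true` option; it should only be run in a disposable VM, since a kernel affected by this bug may hang.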

@cpaasch hit a soft lockup when checking on net and net-next:

```
[   64.674605] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [syz-executor:3338]
[   64.676438] Modules linked in:
[   64.677168] CPU: 1 PID: 3338 Comm: syz-executor Not tainted 6.2.0-rc8+ #32
[   64.678763] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[   64.681215] RIP: 0010:mptcp_token_exists+0xd1/0x160
[   64.682338] Code: 35 a0 5b fe 48 89 d8 48 c1 e8 03 42 80 3c 20 00 0f 85 80 00 00 00 48 8b 1b 41 89 df 31 ff 41 83 e7 01 44 89 fe e8 5f 98 5b fe <45> 85 ff 74 8d e8 05 a0 5b fe 48 d1 eb 41 89 ef 44 23 3d 00 3b 12
[   64.686234] RSP: 0018:ffff88811b7094b8 EFLAGS: 00000246
[   64.687355] RAX: 0000000000000000 RBX: 0000000000000ed7 RCX: ffffffff82e0ef41
[   64.688940] RDX: ffff888107841c00 RSI: 0000000000000100 RDI: 0000000000000005
[   64.690639] RBP: 000000008e935705 R08: 0000000000000005 R09: 0000000000000000
[   64.692231] R10: 0000000000000001 R11: 0000000000000000 R12: dffffc0000000000
[   64.693795] R13: ffffed10201a1511 R14: ffff888100d0a878 R15: 0000000000000001
[   64.695455] FS:  00007fde38e36800(0000) GS:ffff88811b700000(0000) knlGS:0000000000000000
[   64.697431] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   64.698852] CR2: 000000000047c550 CR3: 0000000107964000 CR4: 00000000000006e0
[   64.700502] Call Trace:
[   64.701073]  <IRQ>
[   64.701564]  subflow_check_req+0x9d1/0xe70
[   64.702526]  ? __pfx_subflow_check_req+0x10/0x10
[   64.703597]  ? unwind_get_return_address+0x55/0xa0
[   64.704698]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[   64.705929]  ? ip_route_output_flow+0x225/0x2c0
[   64.706974]  ? __pfx_ip_route_output_flow+0x10/0x10
[   64.708086]  ? kmem_cache_alloc+0x177/0x310
[   64.709030]  ? inet_csk_route_req+0x6f4/0x9b0
[   64.710009]  subflow_v4_route_req+0x1f1/0x350
[   64.710991]  tcp_conn_request+0xb29/0x2d40
```

Decoding the trace with `decode_stacktrace.sh`, and eventually a bisect, will be needed here.

_Originally posted by @cpaasch in https://github.com/multipath-tcp/mptcp_net-next/issues/347#issuecomment-1438853979_

matttbe commented 1 year ago

As discussed on IRC, I moved the two patches back to -net, together with "mptcp: refactor passive socket initialization":

New patches for t/upstream-net:

and t/upstream:

Tests are now in progress:


@cpaasch the reproducer should no longer hit the bug on our export-net. Do not hesitate to verify, and to re-open this ticket if it still does :-)