multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev

syzkaller: WARNING in `__mptcp_clean_una` #485

Closed cpaasch closed 7 months ago

cpaasch commented 7 months ago

syzkaller-id: 1a7fbf9ed6cbc80305d5bf808b47edb978a3803c

HEAD: bbeac67456c9

Trace:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Modules linked in:
CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8 8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe <0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9
RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4
RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000
R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000
FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0
Call Trace:
 <TASK>
 __mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline]
 mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline]
 __mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615
 mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767
 process_one_work+0x1e0/0x560 kernel/workqueue.c:3254
 process_scheduled_works kernel/workqueue.c:3335 [inline]
 worker_thread+0x3c7/0x640 kernel/workqueue.c:3416
 kthread+0x121/0x170 kernel/kthread.c:388
 ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
 </TASK>
---[ end trace 0000000000000000 ]---

Kconfig: Kconfig_k7_clean.txt

Reproducer (on k9):

# {Threaded:false Repeat:true RepeatTimes:0 Procs:8 Slowdown:1 Sandbox: SandboxArg:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false Swap:false UseTmpDir:true HandleSegv:false Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
r0 = socket$inet6_mptcp(0xa, 0x1, 0x106)
bind$inet6(r0, &(0x7f0000000080)={0xa, 0x4e23, 0x0, @loopback}, 0x1c)
connect$inet6(r0, &(0x7f00000000c0)={0xa, 0x4e23, 0x0, @loopback}, 0x1c)
r1 = socket$inet6_udp(0xa, 0x2, 0x0)
setsockopt$IP6T_SO_SET_REPLACE(r1, 0x29, 0x40, &(0x7f00000000c0)=@raw={'raw\x00', 0x9, 0x3, 0x278, 0xd8, 0xffffffff, 0xfdffffff, 0x1a8, 0xffffffff, 0xd8, 0xffffffff, 0xffffffff, 0x1a8, 0xffffffff, 0x3, 0x0, {[{{@ipv6={@loopback, @private2, [], [], 'bridge_slave_1\x00', 'veth1_to_bridge\x00'}, 0x0, 0xa8, 0xd8}, @common=@unspec=@CONNMARK={0x30}}, {{@uncond, 0x0, 0xa8, 0xd0}, @common=@unspec=@STANDARD={0x28, '\x00', 0x0, 0xffffffffffffffff}}], {{'\x00', 0x0, 0xa8, 0xd0}, {0x28}}}}, 0x318)
sendmmsg$inet6(r0, &(0x7f0000002100)=[{{0x0, 0x0, &(0x7f0000000340)=[{&(0x7f0000000140)="ea", 0x13618c2}], 0x1}}, {{0x0, 0x0, &(0x7f0000000500)=[{&(0x7f0000000400)='f', 0x1}], 0x1}}, {{0x0, 0x0, &(0x7f0000001900)=[{&(0x7f0000000640)="d2", 0x1}], 0x1}}, {{0x0, 0x0, &(0x7f0000001c40)=[{&(0x7f0000001b80)="02", 0x1}], 0x1}}], 0x4, 0x0)

pabeni commented 7 months ago

The reproducer does a self-connect, causing a fallback, and sets up some nft filter I can't decode. I guess it drops or delays some packets.

The splat happens on mptcp-level re-injection. A simple fix (work-around) would probably be skipping re-injection for fallback flows.

Still, it is more interesting to understand why the splat happens: apparently snd_una moved past snd_nxt.

For a fallback socket, snd_una == snd_nxt within the scope of the mptcp socket lock and the mptcp data lock, so the root cause is not clear.
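For readers who find the syzkaller syntax hard to follow, the self-connect part of the reproducer can be sketched in plain C roughly as below. This is only an approximation of the first three reproducer calls: the helper names `toy_loopback_addr()` and `toy_self_connect()` are hypothetical, the netfilter setup and the `sendmmsg()` call are left out, and the sketch assumes a Linux kernel with MPTCP enabled.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* 0x106, the protocol number used by the reproducer */
#endif

/* The reproducer passes 0x1c as the address length. */
static_assert(sizeof(struct sockaddr_in6) == 0x1c, "unexpected sockaddr_in6 size");

/* Hypothetical helper: build the IPv6 loopback address for the given port. */
static struct sockaddr_in6 toy_loopback_addr(uint16_t port)
{
	struct sockaddr_in6 addr;

	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;
	addr.sin6_port = htons(port);
	addr.sin6_addr = in6addr_loopback;
	return addr;
}

/* Hypothetical helper: bind an MPTCP socket to [::1]:port and connect it
 * to that very same address, i.e. a self-connect.
 * Returns the socket fd on success or a negative errno value on failure.
 */
static int toy_self_connect(uint16_t port)
{
	struct sockaddr_in6 addr = toy_loopback_addr(port);
	int fd, err;

	fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_MPTCP);
	if (fd < 0)
		return -errno;

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		err = errno;
		close(fd);
		return -err;
	}
	return fd;
}
```

Calling `toy_self_connect(0x4e23)` would mirror the port used in the reproducer; the caller is expected to `close()` the returned fd.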

pabeni commented 7 months ago

It looks like 'snd_una' is not initialized for fallback sockets before the first ack is received.

The attached patch should fix the issue. @cpaasch: could you please test it? una.patch.txt
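To illustrate why an uninitialized snd_una can look like it moved past snd_nxt, here is a small user-space sketch. It is not the kernel code and not the attached patch: `struct toy_msk` and all `toy_*` helpers are hypothetical, and the comparison only mirrors the wrap-safe idea behind the kernel's `before64()`/`after64()` helpers. The assumption is that snd_nxt starts from an arbitrary 64-bit value while snd_una is still zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe 64-bit sequence comparison: seq1 is "before" seq2 when the
 * signed difference is negative.
 */
static bool toy_before64(uint64_t seq1, uint64_t seq2)
{
	return (int64_t)(seq1 - seq2) < 0;
}

static bool toy_after64(uint64_t seq1, uint64_t seq2)
{
	return toy_before64(seq2, seq1);
}

/* Hypothetical toy model holding only the two sequence numbers discussed. */
struct toy_msk {
	uint64_t snd_una;
	uint64_t snd_nxt;
};

/* The expected invariant: snd_una never moves past snd_nxt. */
static bool toy_una_is_sane(const struct toy_msk *msk)
{
	return !toy_after64(msk->snd_una, msk->snd_nxt);
}

/* Assumed fallback behavior: snd_una simply tracks snd_nxt. */
static void toy_fallback_sync(struct toy_msk *msk)
{
	msk->snd_una = msk->snd_nxt;
}

static void toy_demo(void)
{
	/* snd_una left at zero, snd_nxt derived from an arbitrary initial
	 * sequence number that happens to have the top bit set.
	 */
	struct toy_msk msk = { .snd_una = 0, .snd_nxt = 0x8000000000000010ULL };

	/* In wrap-safe arithmetic, 0 is "after" such a snd_nxt. */
	assert(!toy_una_is_sane(&msk));

	/* Once snd_una is set from snd_nxt, the invariant holds again. */
	toy_fallback_sync(&msk);
	assert(toy_una_is_sane(&msk));
}
```

In this toy model, whether a zero snd_una appears to be past snd_nxt depends only on the top bit of snd_nxt, so such a problem would show up for only some initial sequence numbers.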

cpaasch commented 7 months ago

@pabeni - the latest patch works: c4965fb58d4c