multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev

syzkaller: possible deadlock in `sk_clone_lock` #438

Closed: cpaasch closed 1 year ago

cpaasch commented 1 year ago

syzkaller-id: 664c5311cf44e3aa732eaa6b2e79eb4a8961ec08

HEAD: 2227de86e754

Trace:

============================================
WARNING: possible recursive locking detected
6.5.0-g2227de86e754 #45 Not tainted
--------------------------------------------
syz-executor.5/3123 is trying to acquire lock:
ffff88803b4b4430 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff88803b4b4430 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x12a/0x690 net/core/sock.c:2317

but task is already holding lock:
ffff88803e123b70 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff88803e123b70 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x12a/0x690 net/core/sock.c:2317

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(k-slock-AF_INET);
  lock(k-slock-AF_INET);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

7 locks held by syz-executor.5/3123:
 #0: ffff88803b4b3970 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1722 [inline]
 #0: ffff88803b4b3970 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x40/0x850 net/mptcp/protocol.c:1780
 #1: ffff88803e1247b0 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1722 [inline]
 #1: ffff88803e1247b0 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg_fastopen+0x7f/0x200 net/mptcp/protocol.c:1737
 #2: ffffffff83559bb8 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x5/0x40 include/linux/rcupdate.h:302
 #3: ffffffff83559bb8 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x5/0x40 include/linux/rcupdate.h:302
 #4: ffffffff83559bb8 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x9/0x40 include/linux/rcupdate.h:303
 #5: ffffffff83559bb8 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x5/0x40 include/linux/rcupdate.h:302
 #6: ffff88803e123b70 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 #6: ffff88803e123b70 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x12a/0x690 net/core/sock.c:2317

stack backtrace:
CPU: 3 PID: 3123 Comm: syz-executor.5 Not tainted 6.5.0-g2227de86e754 #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xad/0xf0 lib/dump_stack.c:106
 check_deadlock kernel/locking/lockdep.c:3070 [inline]
 validate_chain kernel/locking/lockdep.c:3863 [inline]
 __lock_acquire+0x12b6/0x2dd0 kernel/locking/lockdep.c:5144
 lock_acquire+0xdd/0x230 kernel/locking/lockdep.c:5761
 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
 _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
 spin_lock include/linux/spinlock.h:351 [inline]
 sk_clone_lock+0x12a/0x690 net/core/sock.c:2317
 mptcp_sk_clone_init+0x2c/0x3c0 net/mptcp/protocol.c:3167
 subflow_syn_recv_sock+0x41d/0x7e0 net/mptcp/subflow.c:818
 tcp_check_req+0x794/0x950 net/ipv4/tcp_minisocks.c:807
 tcp_v4_rcv+0xea5/0x1680 net/ipv4/tcp_ipv4.c:2077
 ip_protocol_deliver_rcu+0x300/0x5e0 net/ipv4/ip_input.c:205
 ip_local_deliver_finish+0x134/0x210 net/ipv4/ip_input.c:233
 NF_HOOK+0x23a/0x2a0 include/linux/netfilter.h:304
 NF_HOOK+0x23a/0x2a0 include/linux/netfilter.h:304
 __netif_receive_skb+0xe4/0x210 net/core/dev.c:5523
 process_backlog+0x221/0x3a0 net/core/dev.c:5965
 __napi_poll+0x46/0x330 net/core/dev.c:6527
 napi_poll net/core/dev.c:6594 [inline]
 net_rx_action+0x216/0x4e0 net/core/dev.c:6727
 __do_softirq+0x158/0x3e6 kernel/softirq.c:553
 do_softirq+0x8b/0xd0 kernel/softirq.c:454
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x10c/0x120 kernel/softirq.c:381
 rcu_read_unlock_bh include/linux/rcupdate.h:819 [inline]
 __dev_queue_xmit+0xb93/0x1bb0 net/core/dev.c:4367
 dev_queue_xmit include/linux/netdevice.h:3082 [inline]
 neigh_hh_output include/net/neighbour.h:526 [inline]
 neigh_output include/net/neighbour.h:540 [inline]
 ip_finish_output2+0x63c/0x7e0 net/ipv4/ip_output.c:233
 __ip_queue_xmit+0x88b/0x9a0 net/ipv4/ip_output.c:533
 __tcp_transmit_skb+0xdf6/0x1010 net/ipv4/tcp_output.c:1416
 tcp_rcv_state_process+0x16b0/0x1720 net/ipv4/tcp_input.c:6348
 tcp_v4_do_rcv+0x3dd/0x620 net/ipv4/tcp_ipv4.c:1751
 __release_sock+0xcf/0x150 net/core/sock.c:2983
 release_sock+0x38/0xf0 net/core/sock.c:3520
 mptcp_sendmsg_fastopen+0xb9/0x200 net/mptcp/protocol.c:1743
 mptcp_sendmsg+0x786/0x850 net/mptcp/protocol.c:1786
 sock_sendmsg+0x87/0xd0 net/socket.c:728
 __sys_sendto+0x20d/0x2b0 net/socket.c:2175
 __do_sys_sendto net/socket.c:2187 [inline]
 __se_sys_sendto net/socket.c:2183 [inline]
 __x64_sys_sendto+0x28/0x30 net/socket.c:2183
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f37a70746a9
Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007f37a63a1cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000006bbf80 RCX: 00007f37a70746a9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000020000040 R09: 0000000000000010
R10: 000000002002c011 R11: 0000000000000246 R12: 00000000006bbf8c
R13: fffffffffffffea8 R14: 00000000006bbf80 R15: 000000000001fe40
 </TASK>

No reproducer

Kconfig: Kconfig_k5_lockdep.txt

pabeni commented 1 year ago

I can't make sense of this stack trace: it contains only one sk_clone_lock() call, so I don't see how that could trigger a recursive lock. Unless some prior data corruption is confusing lockdep: KASAN is not enabled, so corruption could go unnoticed until too late.
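
One detail in the report may be relevant here: lockdep checks lock classes, not individual lock instances. The lock being acquired (ffff88803b4b4430) and the lock already held (ffff88803e123b70) are different addresses in the same class, k-slock-AF_INET. sk_clone_lock() also returns with the clone's slock held, so the held lock can come from an earlier call that has already returned and no longer appears in the backtrace. The sketch below only makes the address and class comparison explicit. It does not establish a root cause. `parse_locks` and `same_class_different_instance` are hypothetical helper names, and the regular expression assumes the lock-line format shown in the trace above.

```python
import re

# Two lines copied from the report above: the lock being acquired,
# then the lock already held by the task.
REPORT = """\
ffff88803b4b4430 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x12a/0x690 net/core/sock.c:2317
ffff88803e123b70 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x12a/0x690 net/core/sock.c:2317
"""

# Assumed line shape: optional "#N: " prefix, a 16-digit hex address,
# then the lock class name in parentheses.
LOCK_RE = re.compile(r"^(?:\s*#\d+:\s*)?([0-9a-f]{16}) \(([^)]+)\)", re.MULTILINE)


def parse_locks(text):
    """Return (address, class_name) pairs for every lock line found in text."""
    return [(m.group(1), m.group(2)) for m in LOCK_RE.finditer(text)]


def same_class_different_instance(a, b):
    """True when two locks share a lockdep class but are distinct objects."""
    return a[1] == b[1] and a[0] != b[0]


acquiring, held = parse_locks(REPORT)
print(acquiring, held)
print(same_class_different_instance(acquiring, held))
```

If the two locks really are distinct sockets, the "May be due to missing lock nesting notation" hint in the report would apply. Whether that is the case here or the class assignment itself is wrong is left open.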

matttbe commented 1 year ago

Some notes from the meeting of the 12th of September:

pabeni commented 1 year ago

This really looks like a duplicate of #447.