matttbe closed this issue 1 year ago
I just noticed the CI at Tessares found a very similar issue at the same time:
01:03:54.304 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_CONNECT-no-cookie.pkt (ipv6)]
01:03:54.304 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-MSG_FASTOPEN.pkt (ipv4-mapped-v6)]
01:03:54.305 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_CONNECT-no-cookie.pkt (ipv4-mapped-v6)]
01:03:54.305 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_CONNECT.pkt (ipv4-mapped-v6)]
01:03:54.305 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_NO_COOKIE.pkt (ipv4)]
01:03:54.305 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_NO_COOKIE.pkt (ipv6)]
01:03:54.305 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_CONNECT-no-cookie.pkt (ipv4)]
01:03:54.306 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_CONNECT.pkt (ipv6)]
01:03:54.306 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/fastopen-invalid-buf-ptr.pkt (ipv4)]
01:03:54.306 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_CONNECT.pkt (ipv4)]
01:03:54.306 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/fastopen-invalid-buf-ptr.pkt (ipv4-mapped-v6)]
01:03:54.306 [ 1739.789398]
01:03:54.306 [ 1739.789646] ======================================================
01:03:54.306 [ 1739.790487] WARNING: possible circular locking dependency detected
01:03:54.306 [ 1739.791309] 6.1.0-rc7-ge4c90e57839e #2 Tainted: G N
01:03:54.307 [ 1739.791803] ------------------------------------------------------
01:03:54.307 [ 1739.792531] packetdrill/27464 is trying to acquire lock:
01:03:54.573 [ 1739.793063] ffff88800bdd68e0 ((work_completion)(&msk->work)){+.+.}-{0:0}, at: __flush_work (kernel/workqueue.c:3069)
01:03:54.573 [ 1739.793973]
01:03:54.573 [ 1739.793973] but task is already holding lock:
01:03:54.860 [ 1739.794595] ffff888001bfc130 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp_close (net/mptcp/protocol.c:2973)
01:03:54.860 [ 1739.795383]
01:03:54.861 [ 1739.795383] which lock already depends on the new lock.
01:03:54.861 [ 1739.795383]
01:03:54.861 [ 1739.796353]
01:03:54.861 [ 1739.796353] the existing dependency chain (in reverse order) is:
01:03:54.861 [ 1739.797130]
01:03:54.861 [ 1739.797130] -> #1 (sk_lock-AF_INET6){+.+.}-{0:0}:
01:03:55.150 [ 1739.797724] lock_acquire (kernel/locking/lockdep.c:466)
01:03:55.437 [ 1739.798039] lock_sock_nested (net/core/sock.c:3463)
01:03:55.724 [ 1739.798359] mptcp_worker (net/mptcp/protocol.c:2614)
01:03:56.012 [ 1739.798658] process_one_work (kernel/workqueue.c:2289)
01:03:56.308 [ 1739.799007] worker_thread (./include/linux/list.h:292)
01:03:56.591 [ 1739.799313] kthread (kernel/kthread.c:376)
01:03:56.878 [ 1739.799610] ret_from_fork (arch/x86/entry/entry_64.S:306)
01:03:56.878 [ 1739.799921]
01:03:56.878 [ 1739.799921] -> #0 ((work_completion)(&msk->work)){+.+.}-{0:0}:
01:03:57.154 [ 1739.800467] check_prev_add (kernel/locking/lockdep.c:3098)
01:03:57.421 [ 1739.800794] __lock_acquire (kernel/locking/lockdep.c:3217)
01:03:57.423 [ 1739.801149] lock_acquire (kernel/locking/lockdep.c:466)
01:03:57.597 [ 1739.801445] __flush_work (kernel/workqueue.c:3070)
01:03:57.847 [ 1739.801765] __cancel_work_timer (kernel/workqueue.c:3160)
01:03:58.135 [ 1739.802274] mptcp_cancel_work (net/mptcp/protocol.c:2758)
01:03:58.421 [ 1739.802731] mptcp_subflow_queue_clean (net/mptcp/subflow.c:1817)
01:03:58.690 [ 1739.803303] __mptcp_close_ssk (net/mptcp/protocol.c:2363)
01:03:59.008 [ 1739.803667] mptcp_destroy_common (net/mptcp/protocol.c:3170 (discriminator 3))
01:03:59.268 [ 1739.804052] mptcp_destroy (./include/net/sock.h:1495)
01:03:59.538 [ 1739.804349] __mptcp_destroy_sock (net/mptcp/protocol.c:2884)
01:03:59.808 [ 1739.804695] __mptcp_close (net/mptcp/protocol.c:2959)
01:04:00.026 [ 1739.805010] mptcp_close (net/mptcp/protocol.c:2974)
01:04:00.309 [ 1739.805294] inet_release (net/ipv4/af_inet.c:431)
01:04:00.578 [ 1739.805587] __sock_release (net/socket.c:650)
01:04:00.873 [ 1739.805930] sock_close (net/socket.c:1367)
01:04:01.123 [ 1739.806281] __fput (fs/file_table.c:320)
01:04:01.414 [ 1739.806546] task_work_run (kernel/task_work.c:179 (discriminator 1))
01:04:01.692 [ 1739.806860] exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49)
01:04:01.985 [ 1739.807250] syscall_exit_to_user_mode (kernel/entry/common.c:130)
01:04:02.262 [ 1739.807608] do_syscall_64 (arch/x86/entry/common.c:87)
01:04:02.540 [ 1739.807922] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
01:04:02.540 [ 1739.808406]
01:04:02.540 [ 1739.808406] other info that might help us debug this:
01:04:02.540 [ 1739.808406]
01:04:02.540 [ 1739.808988] Possible unsafe locking scenario:
01:04:02.540 [ 1739.808988]
01:04:02.540 [ 1739.809454]        CPU0                    CPU1
01:04:02.540 [ 1739.809786]        ----                    ----
01:04:02.541 [ 1739.810204]   lock(sk_lock-AF_INET6);
01:04:02.541 [ 1739.810467]                                lock((work_completion)(&msk->work));
01:04:02.541 [ 1739.811010]                                lock(sk_lock-AF_INET6);
01:04:02.541 [ 1739.811437]   lock((work_completion)(&msk->work));
01:04:02.541 [ 1739.811774]
01:04:02.541 [ 1739.811774] *** DEADLOCK ***
01:04:02.541 [ 1739.811774]
01:04:02.541 [ 1739.812253] 2 locks held by packetdrill/27464:
01:04:02.716 [ 1739.812576] #0: ffff88800d8e0210 (&sb->s_type->i_mutex_key#6){+.+.}-{3:3}, at: __sock_release (net/socket.c:650)
01:04:02.718 [ 1739.813264] #1: ffff888001bfc130 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp_close (net/mptcp/protocol.c:2973)
01:04:02.718 [ 1739.813861]
01:04:02.718 [ 1739.813861] stack backtrace:
01:04:02.718 [ 1739.814196] CPU: 0 PID: 27464 Comm: packetdrill Tainted: G N 6.1.0-rc7-ge4c90e57839e #2
01:04:02.719 [ 1739.814840] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
01:04:02.719 [ 1739.815529] Call Trace:
01:04:02.719 [ 1739.815722] <TASK>
01:04:02.998 [ 1739.815909] dump_stack_lvl (lib/dump_stack.c:107)
01:04:03.272 [ 1739.816248] check_noncircular (kernel/locking/lockdep.c:2180)
01:04:03.562 [ 1739.816588] ? print_circular_bug.isra.0 (kernel/locking/lockdep.c:2156)
01:04:03.849 [ 1739.817015] ? mark_lock.part.0 (kernel/locking/lockdep.c:4612)
01:04:04.131 [ 1739.817339] ? kernel_text_address (kernel/extable.c:97)
01:04:04.403 [ 1739.817658] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:26)
01:04:04.693 [ 1739.817957] ? register_lock_class (kernel/locking/lockdep.c:1381 (discriminator 1))
01:04:04.695 [ 1739.818286] check_prev_add (kernel/locking/lockdep.c:3098)
01:04:05.018 [ 1739.818570] ? stack_trace_save (kernel/stacktrace.c:123)
01:04:05.020 [ 1739.818864] __lock_acquire (kernel/locking/lockdep.c:3217)
01:04:05.295 [ 1739.819171] ? alloc_list_entry (./arch/x86/include/asm/bitops.h:68)
01:04:05.470 [ 1739.819461] ? register_lock_class (kernel/locking/lockdep.c:4913)
01:04:05.759 [ 1739.819778] ? rcu_read_lock_sched_held (./include/linux/lockdep.h:283)
01:04:05.761 [ 1739.820213] lock_acquire (kernel/locking/lockdep.c:466)
01:04:05.763 [ 1739.820483] ? __flush_work (kernel/workqueue.c:3069)
01:04:06.046 [ 1739.820762] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:5636)
01:04:06.048 [ 1739.821136] ? mark_lock.part.0 (kernel/locking/lockdep.c:4612)
01:04:06.050 [ 1739.821450] ? register_lock_class (kernel/locking/lockdep.c:4913)
01:04:06.325 [ 1739.821775] ? check_chain_key (kernel/locking/lockdep.c:3894)
01:04:06.326 [ 1739.822087] __flush_work (kernel/workqueue.c:3070)
01:04:06.328 [ 1739.822345] ? __flush_work (kernel/workqueue.c:3069)
01:04:06.617 [ 1739.822614] ? queue_delayed_work_on (kernel/workqueue.c:3060)
01:04:06.618 [ 1739.822958] ? register_lock_class (kernel/locking/lockdep.c:4913)
01:04:06.902 [ 1739.823337] ? mark_held_locks (kernel/locking/lockdep.c:4236)
01:04:06.904 [ 1739.823618] ? mark_lock.part.0 (kernel/locking/lockdep.c:4612)
01:04:06.906 [ 1739.823982] ? mark_held_locks (kernel/locking/lockdep.c:4236)
01:04:06.908 [ 1739.824270] __cancel_work_timer (kernel/workqueue.c:3160)
01:04:07.181 [ 1739.824587] ? cancel_delayed_work (kernel/workqueue.c:3113)
01:04:07.358 [ 1739.824905] ? mptcp_subflow_queue_clean (net/mptcp/subflow.c:1816)
01:04:07.360 [ 1739.825268] ? mark_held_locks (kernel/locking/lockdep.c:4236)
01:04:07.535 [ 1739.825546] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:466)
01:04:07.537 [ 1739.825946] mptcp_cancel_work (net/mptcp/protocol.c:2758)
01:04:07.539 [ 1739.826295] mptcp_subflow_queue_clean (net/mptcp/subflow.c:1817)
01:04:07.541 [ 1739.826635] __mptcp_close_ssk (net/mptcp/protocol.c:2363)
01:04:07.543 [ 1739.826993] mptcp_destroy_common (net/mptcp/protocol.c:3170 (discriminator 3))
01:04:07.545 [ 1739.827299] mptcp_destroy (./include/net/sock.h:1495)
01:04:07.547 [ 1739.827557] __mptcp_destroy_sock (net/mptcp/protocol.c:2884)
01:04:07.549 [ 1739.827861] __mptcp_close (net/mptcp/protocol.c:2959)
01:04:07.551 [ 1739.828248] mptcp_close (net/mptcp/protocol.c:2974)
01:04:07.552 [ 1739.828496] inet_release (net/ipv4/af_inet.c:431)
01:04:07.554 [ 1739.828796] __sock_release (net/socket.c:650)
01:04:07.556 [ 1739.829086] sock_close (net/socket.c:1367)
01:04:07.558 [ 1739.829325] __fput (fs/file_table.c:320)
01:04:07.559 [ 1739.829559] task_work_run (kernel/task_work.c:179 (discriminator 1))
01:04:07.851 [ 1739.829821] ? task_work_cancel (kernel/task_work.c:147)
01:04:07.853 [ 1739.830231] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:466)
01:04:07.855 [ 1739.830649] exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49)
01:04:07.857 [ 1739.831079] syscall_exit_to_user_mode (kernel/entry/common.c:130)
01:04:07.859 [ 1739.831416] do_syscall_64 (arch/x86/entry/common.c:87)
01:04:07.860 [ 1739.831680] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
01:04:07.860 [ 1739.832094] RIP: 0033:0x7f03a965f45b
01:04:07.864 [ 1739.832363] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 f3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 31 fc ff ff 8b 44
01:04:07.976 All code
01:04:07.976 ========
01:04:07.976 0: 03 00 add (%rax),%eax
01:04:07.976 2: 00 00 add %al,(%rax)
01:04:07.976 4: 0f 05 syscall
01:04:07.976 6: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
01:04:07.976 c: 77 41 ja 0x4f
01:04:07.976 e: c3 retq
01:04:07.977 f: 48 83 ec 18 sub $0x18,%rsp
01:04:07.977 13: 89 7c 24 0c mov %edi,0xc(%rsp)
01:04:07.977 17: e8 f3 fb ff ff callq 0xfffffffffffffc0f
01:04:07.977 1c: 8b 7c 24 0c mov 0xc(%rsp),%edi
01:04:07.977 20: 41 89 c0 mov %eax,%r8d
01:04:07.977 23: b8 03 00 00 00 mov $0x3,%eax
01:04:07.977 28: 0f 05 syscall
01:04:07.977 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
01:04:07.977 30: 77 2f ja 0x61
01:04:07.977 32: 44 89 c7 mov %r8d,%edi
01:04:07.977 35: 89 44 24 0c mov %eax,0xc(%rsp)
01:04:07.977 39: e8 31 fc ff ff callq 0xfffffffffffffc6f
01:04:07.977 3e: 8b .byte 0x8b
01:04:07.977 3f: 44 rex.R
01:04:07.977
01:04:07.977 Code starting with the faulting instruction
01:04:07.977 ===========================================
01:04:07.977 0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
01:04:07.977 6: 77 2f ja 0x37
01:04:07.977 8: 44 89 c7 mov %r8d,%edi
01:04:07.977 b: 89 44 24 0c mov %eax,0xc(%rsp)
01:04:07.977 f: e8 31 fc ff ff callq 0xfffffffffffffc45
01:04:07.977 14: 8b .byte 0x8b
01:04:07.977 15: 44 rex.R
01:04:07.978 [ 1739.833645] RSP: 002b:00007ffe3e7105e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
01:04:07.979 [ 1739.834268] RAX: 0000000000000000 RBX: 0000557d35342e30 RCX: 00007f03a965f45b
01:04:07.979 [ 1739.834805] RDX: 0000557d3649a0f0 RSI: 0000557d3649a0f0 RDI: 0000000000000007
01:04:07.979 [ 1739.835315] RBP: 00007ffe3e710620 R08: 0000000000000000 R09: 0000000000000000
01:04:07.979 [ 1739.835804] R10: 0000000000000000 R11: 0000000000000293 R12: 0000557d353040e0
01:04:07.979 [ 1739.836395] R13: 00007ffe3e710ac0 R14: 0000000000000000 R15: 0000000000000000
01:04:07.979 [ 1739.836920] </TASK>
01:04:07.979 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/client-TCP_FASTOPEN_NO_COOKIE.pkt (ipv4-mapped-v6)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/server-TCP_FASTOPEN_NO_COOKIE.pkt (ipv4)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/server-TCP_FASTOPEN_NO_COOKIE.pkt (ipv6)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/server-tfo-rst-before-accept.pkt (ipv4-mapped-v6)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/fastopen-invalid-buf-ptr.pkt (ipv6)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/server-tfo-rst-before-accept.pkt (ipv4)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/server-TCP_FASTOPEN_NO_COOKIE.pkt (ipv4-mapped-v6)]
01:04:07.980 # OK [/opt/packetdrill/gtests/net/mptcp/fastopen/server-tfo-rst-before-accept.pkt (ipv6)]
01:04:07.980 # Ran 33 tests: 33 passing, 0 failing, 0 timed out (18.61 sec): mptcp/fastopen
This is the same as issue #321: closing the latter as a dup, as this one contains more info (the lockdep report).
With an equally low level of coffee, I would say this is a false positive: mptcp_subflow_queue_clean() owns the listener's msk socket lock and calls cancel_work_sync() on an enqueued msk. Not sure how to tell lockdep that. Perhaps changing the msk socket lock key while in the receive queue?!?
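For what it's worth, the false positive can be modelled outside the kernel: lockdep tracks dependencies per lock *class*, not per lock instance, so the listener's sk_lock and the enqueued msk's sk_lock count as the same node in the dependency graph. A toy sketch in Python (all names hypothetical; this only illustrates the class-granularity point, not the actual kernel implementation):

```python
# Toy model of lockdep's class-granularity dependency tracking.
# All names are hypothetical; this is not kernel code.

class OrderTracker:
    def __init__(self):
        self.edges = set()   # (held_class, acquired_class) pairs seen so far

    def acquire(self, held, acquired):
        # Record "acquired was taken while held was held"; a pre-existing
        # reverse edge means lockdep would report a possible deadlock.
        warn = (acquired, held) in self.edges
        self.edges.add((held, acquired))
        return warn

t = OrderTracker()

# Chain #1, mptcp_worker: the work item runs, then locks the msk socket
warn1 = t.acquire("work_completion", "sk_lock-AF_INET6")

# Chain #2, mptcp_close: the *listener's* sk_lock is held while flushing the
# enqueued msk's work -- a different socket instance, but the same lock class
warn2 = t.acquire("sk_lock-AF_INET6", "work_completion")

print(warn1, warn2)  # False True
```

At instance granularity the two chains involve distinct locks and there is no cycle; at class granularity the second chain closes the loop, which is exactly what the splat above reports.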
@pabeni thank you for having looked at that!
I was wondering why we started to see these warnings 3 days ago, and it looks like it's because that's when we merged the PR adding the TFO packetdrill tests.
What's funny is that if I remove the reproducer for #316, I don't have the issue, and if I only run this one, I don't have the issue either :-)
I can confirm the bug is not new. It is visible with the export branch from the 23rd of November, once b4f166651d03 ("mptcp: fix sleep in atomic at close time") has been applied. I cannot check without it because I either hit the bug from #316 or can no longer reproduce the issue.
I don't know what the best solution is to tell lockdep everything is OK.
To avoid having this failure in all the commits we validate, I can revert https://github.com/multipath-tcp/packetdrill/pull/103 for the moment.
I just did that: https://github.com/multipath-tcp/packetdrill/pull/109
@matttbe: could you please test the following workaround?
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 6d03bdcda33e..289dd4add88c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2363,7 +2363,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 	/* otherwise tcp will dispose of the ssk and subflow ctx */
 	if (ssk->sk_state == TCP_LISTEN) {
 		tcp_set_state(ssk, TCP_CLOSE);
-		mptcp_subflow_queue_clean(ssk);
+		mptcp_subflow_queue_clean(sk, ssk);
 		inet_csk_listen_stop(ssk);
 		mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED);
 	}
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index a9ff7028fad8..3cd3270407b0 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -630,7 +630,7 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
 		     struct mptcp_subflow_context *subflow);
 void __mptcp_subflow_send_ack(struct sock *ssk);
 void mptcp_subflow_reset(struct sock *ssk);
-void mptcp_subflow_queue_clean(struct sock *ssk);
+void mptcp_subflow_queue_clean(struct sock *sk, struct sock *ssk);
 void mptcp_sock_graft(struct sock *sk, struct socket *parent);
 struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk);
 bool __mptcp_close(struct sock *sk, long timeout);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 9341f9313154..2f7830c65d95 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1778,7 +1778,7 @@ static void subflow_state_change(struct sock *sk)
 	}
 }
 
-void mptcp_subflow_queue_clean(struct sock *listener_ssk)
+void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)
 {
 	struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
 	struct mptcp_sock *msk, *next, *head = NULL;
@@ -1827,8 +1827,18 @@ void mptcp_subflow_queue_clean(struct sock *listener_ssk)
 		do_cancel_work = __mptcp_close(sk, 0);
 		release_sock(sk);
-		if (do_cancel_work)
+		if (do_cancel_work) {
+			/* lockdep will report a false positive ABBA deadlock between
+			 * cancel_work_sync and the listener socket. The involved locks
+			 * belong to different sockets WRT the existing AB chain.
+			 * Using a per socket key would be very invasive and heavy, just
+			 * tell lockdep to consider the listener socket released here
+			 */
+			mutex_release(&listener_sk->sk_lock.dep_map, _RET_IP_);
 			mptcp_cancel_work(sk);
+			mutex_acquire(&listener_sk->sk_lock.dep_map,
+				      SINGLE_DEPTH_NESTING, 0, _RET_IP_);
+		}
 		sock_put(sk);
 	}
also available on pastebin: https://paste.centos.org/view/dff51e32
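If it helps to picture what the mutex_release()/mutex_acquire() pair in the patch does: it only updates lockdep's bookkeeping, telling it the listener lock is notionally dropped around the blocking cancel, so no sk_lock to work_completion edge is ever recorded. A toy sketch in the same spirit (plain Python, hypothetical names, not kernel code):

```python
# Toy class-granularity dependency tracker in the spirit of lockdep.
# Hypothetical names; this models only the bookkeeping, not real locking.

class OrderTracker:
    def __init__(self):
        self.edges = set()   # (held_class, acquired_class) pairs seen so far
        self.held = []       # classes currently "held"

    def lock(self, cls):
        # A pre-existing reverse edge (cls, h) for any held h would mean
        # a possible cycle, i.e. lockdep would warn here.
        warn = any((cls, h) in self.edges for h in self.held)
        for h in self.held:
            self.edges.add((h, cls))
        self.held.append(cls)
        return warn

    def unlock(self, cls):
        self.held.remove(cls)

t = OrderTracker()

# mptcp_worker path: the work item runs, then locks the msk socket
t.lock("work"); t.lock("sk_lock"); t.unlock("sk_lock"); t.unlock("work")

# close path with the workaround: pretend-release the listener's sk_lock
# around the blocking cancel, as mutex_release()/mutex_acquire() do
t.lock("sk_lock")
t.unlock("sk_lock")      # models mutex_release(&listener_sk->sk_lock.dep_map, ...)
warn = t.lock("work")    # cancel_work_sync(): no sk_lock held, no reverse edge
t.unlock("work")
t.lock("sk_lock")        # models mutex_acquire(...)
t.unlock("sk_lock")

print(warn)  # False: the sk_lock -> work edge is never recorded
```

Without the pretend-release, locking the work item while sk_lock is still held would record the reverse edge and trigger the report.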
@pabeni: thank you for having looked at this so quickly!
That looks like a clever idea!
And it seems to work very well: I cannot reproduce the issue any more with this patch!
Should we release/acquire the mutex only with #ifdef CONFIG_LOCKDEP?
The CI just reported a possible circular locking dependency when validating the TFO packetdrill tests:
Note that the packetdrill tests are executed in parallel; the warning here is likely caused by a test initiating the connection.
Source:
Maybe linked to #321