multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev

net: use-after-free in `mptcp_worker` (net/mptcp/protocol.h:528) #361

Closed by matttbe 1 year ago

matttbe commented 1 year ago

The public CI reported an issue when validating export-net (6.2), but not when validating export (6.3):

+ tee /tmp/cirrus-ci-build/packetdrill_mp_capable.tap.tmp
# OK   [/opt/packetdrill/gtests/net/mptcp/mp_capable/v1_bind_tcpfallback_flagH.pkt (ipv6)]
[ 1711.721363][T10846] ==================================================================
[1711.733149][T10846] BUG: KASAN: use-after-free in mptcp_worker (net/mptcp/protocol.h:528) 
[ 1711.739924][T10846] Read of size 8 at addr ffff888009d76778 by task kworker/0:2/10846
[ 1711.746745][T10846] 
[ 1711.748896][T10846] CPU: 0 PID: 10846 Comm: kworker/0:2 Tainted: G                 N 6.2.0-rc8-g791be828903c #1
[ 1711.757984][T10846] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1711.766816][T10846] Workqueue: events mptcp_worker
[ 1711.771293][T10846] Call Trace:
[ 1711.774615][T10846]  <TASK>
[1711.777483][T10846] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
[1711.781545][T10846] print_address_description.constprop.0 (mm/kasan/report.c:307) 
[1711.787801][T10846] print_report (mm/kasan/report.c:418) 
[1711.792301][T10846] ? kasan_addr_to_slab (arch/x86/include/asm/bitops.h:207) 
[1711.797839][T10846] ? mptcp_worker (net/mptcp/protocol.h:528) 
[1711.801975][T10846] kasan_report (mm/kasan/report.c:184) 
[1711.805724][T10846] ? mptcp_worker (net/mptcp/protocol.h:528) 
[1711.809221][T10846] mptcp_worker (net/mptcp/protocol.h:528) 
[1711.813130][T10846] ? rcu_read_unlock (include/linux/rcupdate.h:793 (discriminator 5)) 
[1711.816527][T10846] ? debug_object_active_state (lib/debugobjects.c:728) 
[1711.821512][T10846] ? mptcp_release_cb (net/mptcp/protocol.c:2613) 
[1711.825664][T10846] process_one_work (kernel/workqueue.c:2294) 
[1711.829537][T10846] ? rcu_read_unlock (include/linux/rcupdate.h:793 (discriminator 5)) 
[1711.832837][T10846] ? pwq_dec_nr_in_flight (kernel/workqueue.c:2184) 
[1711.836713][T10846] ? rwlock_bug.part.0 (kernel/locking/spinlock_debug.c:113) 
[1711.841128][T10846] ? _raw_spin_lock_irq (include/linux/spinlock_api_smp.h:117) 
[1711.844481][T10846] worker_thread (include/linux/list.h:292) 
[1711.847586][T10846] ? process_one_work (kernel/workqueue.c:2379) 
[1711.851488][T10846] kthread (kernel/kthread.c:376) 
[1711.854541][T10846] ? kthread_complete_and_exit (kernel/kthread.c:331) 
[1711.858686][T10846] ret_from_fork (arch/x86/entry/entry_64.S:314) 
[ 1711.862641][T10846]  </TASK>
[ 1711.865737][T10846] 
[ 1711.868132][T10846] Allocated by task 16172:
[1711.871577][T10846] kasan_save_stack (mm/kasan/common.c:46) 
[1711.874773][T10846] kasan_set_track (mm/kasan/common.c:52) 
[1711.878197][T10846] __kasan_kmalloc (mm/kasan/common.c:384) 
[1711.882406][T10846] __kmalloc (mm/slab_common.c:969) 
[1711.885246][T10846] sk_prot_alloc (include/linux/slab.h:584) 
[1711.888545][T10846] sk_clone_lock (net/core/sock.c:2244) 
[1711.892120][T10846] inet_csk_clone_lock (net/ipv4/inet_connection_sock.c:1120) 
[1711.896094][T10846] tcp_create_openreq_child (net/ipv4/tcp_minisocks.c:491) 
[1711.900578][T10846] tcp_v4_syn_recv_sock (net/ipv4/tcp_ipv4.c:1570) 
[1711.904577][T10846] tcp_v6_syn_recv_sock (net/ipv6/tcp_ipv6.c:1221) 
[1711.908552][T10846] subflow_syn_recv_sock (net/mptcp/subflow.c:786) 
[1711.912314][T10846] tcp_check_req (net/ipv4/tcp_minisocks.c:805) 
[1711.915948][T10846] tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2074) 
[1711.919690][T10846] ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205) 
[1711.924345][T10846] ip_local_deliver_finish (include/linux/rcupdate.h:793) 
[1711.928477][T10846] ip_local_deliver (include/linux/netfilter.h:302) 
[1711.932266][T10846] ip_rcv (include/linux/netfilter.h:302) 
[1711.935137][T10846] __netif_receive_skb_one_core (net/core/dev.c:5467) 
[1711.939427][T10846] netif_receive_skb_internal (net/core/dev.c:5674) 
[1711.943881][T10846] netif_receive_skb (net/core/dev.c:5733) 
[1711.947472][T10846] tun_rx_batched (include/linux/bottom_half.h:33) 
[1711.950973][T10846] tun_get_user (drivers/net/tun.c:1983) 
[1711.954564][T10846] tun_chr_write_iter (drivers/net/tun.c:863) 
[1711.958560][T10846] do_iter_readv_writev (fs/read_write.c:736) 
[1711.962408][T10846] do_iter_write (fs/read_write.c:861) 
[1711.966833][T10846] vfs_writev (fs/read_write.c:935) 
[1711.969885][T10846] do_writev (fs/read_write.c:977) 
[1711.973027][T10846] do_syscall_64 (arch/x86/entry/common.c:50) 
[1711.975993][T10846] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) 
[ 1711.979791][T10846] 
[ 1711.981927][T10846] Freed by task 16172:
[1711.985603][T10846] kasan_save_stack (mm/kasan/common.c:46) 
[1711.989198][T10846] kasan_set_track (mm/kasan/common.c:52) 
[1711.992958][T10846] kasan_save_free_info (mm/kasan/generic.c:525) 
[1711.996921][T10846] ____kasan_slab_free (mm/kasan/common.c:238) 
[1712.001111][T10846] slab_free_freelist_hook (mm/slub.c:1807) 
[1712.004917][T10846] __kmem_cache_free (mm/slub.c:3787) 
[1712.008724][T10846] __sk_destruct (net/core/sock.c:2076) 
[1712.012599][T10846] inet_csk_listen_stop (include/net/sock.h:1991) 
[1712.016915][T10846] __mptcp_close_ssk (net/mptcp/protocol.c:2361) 
[1712.021374][T10846] mptcp_destroy_common (net/mptcp/protocol.c:3184) 
[1712.025756][T10846] mptcp_destroy (include/net/sock.h:1499) 
[1712.029920][T10846] __mptcp_destroy_sock (net/mptcp/protocol.c:2886) 
[1712.034116][T10846] __mptcp_close (net/mptcp/protocol.c:2968) 
[1712.038403][T10846] mptcp_close (net/mptcp/protocol.c:2983) 
[1712.041889][T10846] inet_release (net/ipv4/af_inet.c:433) 
[1712.046195][T10846] __sock_release (net/socket.c:651) 
[1712.050281][T10846] sock_close (net/socket.c:1370) 
[1712.054099][T10846] __fput (fs/file_table.c:320) 
[1712.057595][T10846] task_work_run (kernel/task_work.c:181 (discriminator 1)) 
[1712.062072][T10846] exit_to_user_mode_prepare (include/linux/resume_user_mode.h:49) 
[1712.066792][T10846] syscall_exit_to_user_mode (kernel/entry/common.c:130) 
[1712.071721][T10846] do_syscall_64 (arch/x86/entry/common.c:87) 
[1712.076264][T10846] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) 
[ 1712.081109][T10846] 

It seems reproducible as it happened 2 days in a row:

It seems to be linked to the two patches from @pabeni I recently applied:

I had some conflicts that I probably didn't resolve properly (or maybe other adaptations are needed for -net?)

https://lore.kernel.org/mptcp/2cb7f6a5-5ecf-ec8a-c734-87c9fbdd1791@gmail.com/T/#mc2c87bcd67a993673f8ff66b259bba449d753d30

matttbe commented 1 year ago

It looks like the two patches depend on the "mptcp: refactor passive socket initialization" patch, which is only in our export branch. Paolo suggested moving these two patches to the export branch instead of export-net, which is what I did below.


Revert on the -net side:

New patches for t/upstream-net:

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20230220T125607


Re-adding on the net-next side, before:

mptcp: avoid unneeded __mptcp_nmpc_socket() usage

New patches for t/upstream:

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20230220T135733

Cheers, Matt