multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev
Other
284 stars 42 forks source link

[syzkaller] INFO: rcu detected stall in ip_rcv #32

Closed cpaasch closed 4 years ago

cpaasch commented 4 years ago

(don't remember the HEAD I am currently running syzkaller on :-/ )

rcu: INFO: rcu_sched self-detected stall on CPU
rcu:    0-....: (1 GPs behind) idle=e3a/1/0x4000000000000002 softirq=7243/7245 fqs=23773 
    (t=100000 jiffies g=13181 q=112)
NMI backtrace for cpu 0
CPU: 0 PID: 2584 Comm: syz-executor420 Not tainted 5.7.0-rc6 #84
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xb7/0xfe lib/dump_stack.c:118
 nmi_cpu_backtrace.cold+0x19/0x84 lib/nmi_backtrace.c:101
 nmi_trigger_cpumask_backtrace+0x193/0x198 lib/nmi_backtrace.c:62
 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
 rcu_dump_cpu_stacks+0xed/0x130 kernel/rcu/tree_stall.h:254
 print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
 check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
 rcu_pending kernel/rcu/tree.c:3225 [inline]
 rcu_sched_clock_irq.cold+0x310/0x57c kernel/rcu/tree.c:2296
 update_process_times+0x25/0x60 kernel/time/timer.c:1726
 tick_sched_handle+0x63/0xe0 kernel/time/tick-sched.c:176
 tick_sched_timer+0x3e/0xd0 kernel/time/tick-sched.c:1320
 __run_hrtimer kernel/time/hrtimer.c:1520 [inline]
 __hrtimer_run_queues+0x247/0x590 kernel/time/hrtimer.c:1584
 hrtimer_interrupt+0x1e6/0x3f0 kernel/time/hrtimer.c:1646
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline]
 smp_apic_timer_interrupt+0x86/0x1e0 arch/x86/kernel/apic/apic.c:1138
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:26 [inline]
RIP: 0010:check_kcov_mode kernel/kcov.c:153 [inline]
RIP: 0010:write_comp_data+0x9/0x70 kernel/kcov.c:208
Code: 80 a4 08 00 00 48 8b 11 48 83 c2 01 48 39 d0 76 07 48 89 34 d1 48 89 11 c3 0f 1f 84 00 00 00 00 00 65 4c 8b 04 25 00 0d 02 00 <65> 8b 05 88 99 dd 7e a9 00 01 1f 00 75 51 41 8b 80 a0 08 00 00 83
RSP: 0018:ffff88811b409390 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000000 RBX: ffff88811057cd68 RCX: ffffffff8274917a
RDX: 0000000051910d22 RSI: 00000000928d5e77 RDI: 0000000000000004
RBP: 00000000928d5e77 R08: ffff8881195aaa00 R09: ffffed102368126f
R10: 0000000000000003 R11: ffffed102368126e R12: 0000000051910d22
R13: ffff88811057cd00 R14: ffff888110412638 R15: ffff888110412630
 __token_lookup_msk net/mptcp/token.c:75 [inline]
 __token_bucket_busy+0xea/0x150 net/mptcp/token.c:83
 mptcp_token_new_request+0x98/0x230 net/mptcp/token.c:115
 subflow_init_req+0x1c8/0x6f0 net/mptcp/subflow.c:157
 tcp_conn_request+0x6a7/0x15e0 net/ipv4/tcp_input.c:6653
 subflow_v4_conn_request+0x60/0x90 net/mptcp/subflow.c:316
 tcp_rcv_state_process+0x638/0x25aa net/ipv4/tcp_input.c:6195
 tcp_v4_do_rcv+0x1ed/0x480 net/ipv4/tcp_ipv4.c:1650
 tcp_v4_rcv+0x1b67/0x1c00 net/ipv4/tcp_ipv4.c:1998
 ip_protocol_deliver_rcu+0x42/0x380 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0xc3/0xe0 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_local_deliver+0x162/0x220 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:441 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
 ip_rcv_finish+0x79/0x90 net/ipv4/ip_input.c:414
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_rcv+0x19d/0x1b0 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0x156/0x180 net/core/dev.c:5268
 __netif_receive_skb+0x29/0xd0 net/core/dev.c:5382
 process_backlog+0x133/0x2d0 net/core/dev.c:6214
 napi_poll net/core/dev.c:6659 [inline]
 net_rx_action+0x2c0/0x7b0 net/core/dev.c:6727
 __do_softirq+0x10d/0x3be kernel/softirq.c:292
 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
 </IRQ>
 do_softirq.part.0+0x26/0x30 kernel/softirq.c:337
 do_softirq arch/x86/include/asm/preempt.h:26 [inline]
 __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:189
 local_bh_enable include/linux/bottom_half.h:32 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:690 [inline]
 ip_finish_output2+0x4a9/0xd60 net/ipv4/ip_output.c:229
 __ip_finish_output net/ipv4/ip_output.c:306 [inline]
 __ip_finish_output+0x1dc/0x420 net/ipv4/ip_output.c:288
 ip_finish_output net/ipv4/ip_output.c:316 [inline]
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip_output+0x12b/0x240 net/ipv4/ip_output.c:430
 dst_output include/net/dst.h:435 [inline]
 ip_local_out+0x6b/0x80 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x372/0x9b0 net/ipv4/ip_output.c:530
 __tcp_transmit_skb+0xdb6/0x1a60 net/ipv4/tcp_output.c:1238
 tcp_transmit_skb net/ipv4/tcp_output.c:1254 [inline]
 tcp_connect+0x1281/0x1820 net/ipv4/tcp_output.c:3671
 tcp_v4_connect+0xb02/0xc50 net/ipv4/tcp_ipv4.c:311
 __inet_stream_connect+0x227/0x7f0 net/ipv4/af_inet.c:658
 inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:722
 mptcp_stream_connect+0x12e/0x220 net/mptcp/protocol.c:1807
 __sys_connect_file+0xcf/0xe0 net/socket.c:1854
 __sys_connect+0x160/0x190 net/socket.c:1871
 __do_sys_connect net/socket.c:1882 [inline]
 __se_sys_connect net/socket.c:1879 [inline]
 __x64_sys_connect+0x3e/0x50 net/socket.c:1879
 do_syscall_64+0x8a/0x290 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fbd64788469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe13606918 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000005638d RCX: 00007fbd64788469
RDX: 000000000000004d RSI: 0000000020000040 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000400f90 R09: 0000000000400f90
R10: 0000000000400f90 R11: 0000000000000246 R12: 0000000000400bb2
R13: 00007ffe13606a20 R14: 0000000000000000 R15: 0000000000000000

syz-repro:

# {Threaded:false Collide:false Repeat:true RepeatTimes:0 Procs:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false UseTmpDir:false HandleSegv:false Repro:false Trace:false}
r0 = socket$inet_mptcp(0x2, 0x1, 0x106)
r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
bind$inet(r1, &(0x7f00000013c0)={0x2, 0x4e20, @multicast2}, 0x10)
connect$inet(r1, &(0x7f0000000040)={0x2, 0x0, @loopback}, 0x10)
listen(r1, 0x3)
connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @loopback}, 0x4d)

C-repro:

// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static void sleep_ms(uint64_t ms)
{
  usleep(ms * 1000);
}

static uint64_t current_time_ms(void)
{
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC, &ts))
    exit(1);
  return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

static bool write_file(const char* file, const char* what, ...)
{
  char buf[1024];
  va_list args;
  va_start(args, what);
  vsnprintf(buf, sizeof(buf), what, args);
  va_end(args);
  buf[sizeof(buf) - 1] = 0;
  int len = strlen(buf);
  int fd = open(file, O_WRONLY | O_CLOEXEC);
  if (fd == -1)
    return false;
  if (write(fd, buf, len) != len) {
    int err = errno;
    close(fd);
    errno = err;
    return false;
  }
  close(fd);
  return true;
}

static void kill_and_wait(int pid, int* status)
{
  kill(-pid, SIGKILL);
  kill(pid, SIGKILL);
  int i;
  for (i = 0; i < 100; i++) {
    if (waitpid(-1, status, WNOHANG | __WALL) == pid)
      return;
    usleep(1000);
  }
  DIR* dir = opendir("/sys/fs/fuse/connections");
  if (dir) {
    for (;;) {
      struct dirent* ent = readdir(dir);
      if (!ent)
        break;
      if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
        continue;
      char abort[300];
      snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
               ent->d_name);
      int fd = open(abort, O_WRONLY);
      if (fd == -1) {
        continue;
      }
      if (write(fd, abort, 1) < 0) {
      }
      close(fd);
    }
    closedir(dir);
  } else {
  }
  while (waitpid(-1, status, __WALL) != pid) {
  }
}

static void setup_test()
{
  prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
  setpgrp();
  write_file("/proc/self/oom_score_adj", "1000");
}

static void execute_one(void);

#define WAIT_FLAGS __WALL

static void loop(void)
{
  int iter;
  for (iter = 0;; iter++) {
    int pid = fork();
    if (pid < 0)
      exit(1);
    if (pid == 0) {
      setup_test();
      execute_one();
      exit(0);
    }
    int status = 0;
    uint64_t start = current_time_ms();
    for (;;) {
      if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
        break;
      sleep_ms(1);
      if (current_time_ms() - start < 5 * 1000)
        continue;
      kill_and_wait(pid, &status);
      break;
    }
  }
}

uint64_t r[2] = {0xffffffffffffffff, 0xffffffffffffffff};

void execute_one(void)
{
  intptr_t res = 0;
  res = syscall(__NR_socket, 2ul, 1ul, 0x106);
  if (res != -1)
    r[0] = res;
  res = syscall(__NR_socket, 2ul, 1ul, 0x106);
  if (res != -1)
    r[1] = res;
  *(uint16_t*)0x200013c0 = 2;
  *(uint16_t*)0x200013c2 = htobe16(0x4e20);
  *(uint32_t*)0x200013c4 = htobe32(0xe0000002);
  syscall(__NR_bind, r[1], 0x200013c0ul, 0x10ul);
  *(uint16_t*)0x20000040 = 2;
  *(uint16_t*)0x20000042 = htobe16(0);
  *(uint32_t*)0x20000044 = htobe32(0x7f000001);
  syscall(__NR_connect, r[1], 0x20000040ul, 0x10ul);
  syscall(__NR_listen, r[1], 3);
  *(uint16_t*)0x20000040 = 2;
  *(uint16_t*)0x20000042 = htobe16(0x4e20);
  *(uint32_t*)0x20000044 = htobe32(0x7f000001);
  syscall(__NR_connect, r[0], 0x20000040ul, 0x4dul);
}
int main(void)
{
  syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 3ul, 0x32ul, -1, 0ul);
  loop();
  return 0;
}
cpaasch commented 4 years ago

Even with the change that adds mptcp_clear_token I am still able to trigger this crasher here.

cpaasch commented 4 years ago

Repro'd on top of:

c004f2d61be9 ("Squash-to: "mptcp: refactor token container."")  (HEAD -> netnext_mptcp_all) (13 minutes ago) <Paolo Abeni>
2ce39af1418f ("mptcp: Enable MPTCP when IPPROTO_MPTCP is set")  (13 minutes ago) <Christoph Paasch>
f29baf817fb1 ("Cleanup")  (13 minutes ago) <Christoph Paasch>
644b0c37d9dd ("Paolos pastebin")  (13 minutes ago) <Christoph Paasch>
1b4a0e116b05 ("FIX inet_csk_prepare_for_destroy_sock")  (13 minutes ago) <Christoph Paasch>
1ee018877415 ("net: mptcp: improve fallback to TCP")  (13 minutes ago) <Davide Caratti>
26bf539be27f ("mptcp: add receive buffer auto-tuning")  (13 minutes ago) <Florian Westphal>
7840fd98f417 ("[DO-NOT-MERGE] mptcp: enabled by default")  (13 minutes ago) <Matthieu Baerts>
3ae48bf92e11 ("mptcp: introduce token KUNIT self-tests")  (13 minutes ago) <Paolo Abeni>
7531bc083b93 ("mptcp: move crypto test to KUNIT")  (13 minutes ago) <Paolo Abeni>
3ed93a4e608c ("mptcp: refactor token container.")  (2 hours ago) <Paolo Abeni>
c8328b91815c ("mptcp: add __init annotation on setup functions")  (2 hours ago) <Paolo Abeni>
1806c13dc253 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")  (netnext/master, netnext_net) (18 hours ago) <David S. Miller>
pabeni commented 4 years ago

Still no clue here. What I observe:

When the issue/race is triggered, at mptcp_close() time an MPTCP listener socket is 'hashed' (that is !!sk_node.pprev, supposedly inserted into the token container), but can't be found into the relevant bucket.

The previous listen() call completed with the mentioned socket being correctly 'unhashed'.

No idea who/how flipped sk_node.pprev meanwhile

cpaasch commented 4 years ago

Can't repro anymore with HEAD:

e7d05321339a ("add mptcp_token_destroy") (HEAD) (11 seconds ago) 76646ab2dfd0 ("Squash-to: "mptcp: refactor token container."") (25 hours ago) d7414e0a5c73 ("Cleanup") (25 hours ago) a3ca6689d434 ("Paolos pastebin") (25 hours ago) ccc3b33a447d ("FIX inet_csk_prepare_for_destroy_sock") (25 hours ago) 446548400687 ("net: mptcp: improve fallback to TCP") (25 hours ago) 671015c089ce ("mptcp: add receive buffer auto-tuning") (25 hours ago) ffa63eb ("[DO-NOT-MERGE] mptcp: enabled by default") (tag: export/20200603T083508, mptcp_net-next/export) (32 hours ago) 21de248 ("mptcp: introduce token KUNIT self-tests") (32 hours ago) 25494d4 ("mptcp: move crypto test to KUNIT") (32 hours ago) 2975319 ("mptcp: refactor token container.") (32 hours ago) 8368186 ("mptcp: add __init annotation on setup functions") (32 hours ago) 556f751 ("bpf: fix unused-var without NETDEVICES") (32 hours ago) 065fcfd ("selftests: net: ip_defrag: ignore EPERM") (netnext/master, mptcp_net-next/net-next) (2 days ago)

cpaasch commented 4 years ago

Closing as it does not reproduce anymore.