iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0

tools/tcpaccept.py's kretprobes not triggered due to low maxactive #1072

Closed (alban closed this issue 5 years ago)

alban commented 7 years ago

When a kretprobe is installed on a kernel function, there is a limit on how many concurrent calls it can track (aka "maxactive"). For an eBPF kretprobe, maxactive is left at the default defined in kernel/kprobes.c:

        /* Pre-allocate memory for max kretprobe instances */
        if (rp->maxactive <= 0) {
#ifdef CONFIG_PREEMPT
                rp->maxactive = max_t(unsigned int, 10, 2*num_possible_cpus());
#else
                rp->maxactive = num_possible_cpus();
#endif
        }

maxactive can be as low as 1 on a single-core machine running a kernel without CONFIG_PREEMPT.
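To make the arithmetic concrete, the default can be mirrored in a few lines of Python. This is only a sketch of the kernel snippet above; `default_maxactive` and its parameter names are made up for illustration and are not part of any kernel or bcc API.

```python
def default_maxactive(num_possible_cpus: int, config_preempt: bool) -> int:
    """Mirror the default picked in kernel/kprobes.c when rp->maxactive <= 0."""
    if config_preempt:
        # CONFIG_PREEMPT kernels: at least 10, otherwise two slots per CPU.
        return max(10, 2 * num_possible_cpus)
    # Non-preemptible kernels: one slot per possible CPU.
    return num_possible_cpus


if __name__ == "__main__":
    for cpus in (1, 4, 8):
        for preempt in (False, True):
            print(cpus, preempt, default_maxactive(cpus, preempt))
```

For example, a 4-CPU kernel gets 4 instances without CONFIG_PREEMPT and 10 with it, so a few dozen tasks sleeping inside the probed function are enough to exhaust the pool.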

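tools/tcpaccept.py installs a kretprobe on inet_csk_accept(), but that function can sleep for a long time (waiting for an incoming connection). Each sleeping call occupies one kretprobe instance until it returns, so once maxactive calls are blocked, the returns of later calls are missed.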

How to reproduce:

  1. Start the tracer: sudo ./tools/tcpaccept.py
  2. In another shell, start a bunch of processes waiting on the accept() system call: for i in $(seq 1 25) ; do (busybox nc -l -p $((8081 + i)) &) ; done
  3. Start nginx locally: sudo docker run -d nginx
  4. Connect to nginx: curl 172.17.0.2
  5. Notice that there is no "accept" event

Note: use nc from busybox to reproduce this issue. nc from https://nmap.org/ncat (the default on Fedora) will not reproduce it because it does not block in the accept() system call: it waits in select() until there is an incoming connection and only then calls accept().
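If busybox is not at hand, step 2 can be approximated with a short Python script. This is a sketch, not part of the original report: `start_blocked_acceptors` is a hypothetical helper, and it assumes that a blocking `socket.accept()` with no timeout set goes straight into the accept system call (as busybox nc does) rather than waiting in select() or poll() first.

```python
import socket
import threading


def start_blocked_acceptors(count, host="127.0.0.1", base_port=0):
    """Start `count` daemon threads, each blocked in accept() on its own listener.

    base_port=0 lets the kernel pick free ports; otherwise the ports
    base_port, base_port + 1, ... are used.
    """
    acceptors = []
    for i in range(count):
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, base_port + i if base_port else 0))
        listener.listen(1)

        def wait_for_connection(sock=listener):
            conn, _ = sock.accept()  # blocks until a peer connects
            conn.close()

        thread = threading.Thread(target=wait_for_connection, daemon=True)
        thread.start()
        acceptors.append((listener, thread))
    return acceptors


if __name__ == "__main__":
    # Same port range as the shell loop in step 2 (8082..8106).
    blocked = start_blocked_acceptors(25, base_port=8082)
    input("25 threads are blocked in accept(); press Enter to exit\n")
```

Run it in place of the busybox loop, then continue with steps 3 to 5 while it is waiting for Enter.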

Similar issue also reported on https://github.com/weaveworks/tcptracer-bpf/issues/24 & https://github.com/weaveworks/tcptracer-bpf/issues/34

goldshtn commented 7 years ago

This in fact looks like a potential issue for any of our tools that use kprobes/uprobes. @4ast @brendangregg

alban commented 7 years ago

When setting up a kretprobe from a kernel module, it's possible to set maxactive to a different value.

But when a kretprobe is created by writing a command to /sys/kernel/debug/tracing/kprobe_events, maxactive cannot be specified. Could the command parser for kprobe_events be updated to accept a maxactive= parameter?

I also hit this issue on another kernel function that (AFAIK) does not sleep, and I don't understand why: https://lists.iovisor.org/pipermail/iovisor-dev/2017-March/000694.html

alban commented 7 years ago

I've tested a patch to make maxactive configurable in /sys/kernel/debug/tracing/kprobe_events: https://github.com/kinvolk/linux/commit/3c713f1ba24255f0537f283ea6cab2efc435903a

Tested with https://github.com/kinvolk/gobpf/commit/e21cc337eee96281475b9dde4d961a975325670b

Do you think this approach would be acceptable upstream? And is there a possible workaround on current kernels without such a patch?

4ast commented 7 years ago

the patch looks good to me. pls send it upstream and explain in commit log that we need to increase maxactive not only for recursive functions, but for functions that sleep or resched.

alban commented 7 years ago

Expected to land in Linux 4.12.
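For reference, the upstream kprobe_events syntax takes the instance count directly after the `r`, in the form `r[MAXACTIVE][:[GRP/]EVENT] SYMBOL`. Below is a minimal sketch of building and registering such a line. The helper names, the event name and the value 100 are illustrative assumptions, not taken from the patch; writing to kprobe_events needs root and a mounted debugfs.

```python
KPROBE_EVENTS = "/sys/kernel/debug/tracing/kprobe_events"


def kretprobe_command(symbol, event, maxactive=None, group="kprobes"):
    """Build a kprobe_events line for a return probe, optionally with maxactive."""
    # "r" keeps the kernel default; "r100" asks for 100 kretprobe instances.
    prefix = "r%d" % maxactive if maxactive else "r"
    return "%s:%s/%s %s" % (prefix, group, event, symbol)


def register_kretprobe(symbol, event, maxactive=None):
    """Append the probe definition to kprobe_events (root only)."""
    with open(KPROBE_EVENTS, "a") as events:
        events.write(kretprobe_command(symbol, event, maxactive) + "\n")


if __name__ == "__main__":
    # Only prints the command; call register_kretprobe() to install it.
    print(kretprobe_command("inet_csk_accept", "r_inet_csk_accept", maxactive=100))
```

On kernels without the patch, the `r100` form should be rejected by the parser, so a caller would probably want to fall back to plain `r` when the write fails.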

alban commented 7 years ago

We make use of this kernel patch via:

bcc could do something similar for tools/tcpaccept.py
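One way this could look in a bcc tool, as a rough sketch only: it assumes a bcc version whose Python `BPF.attach_kretprobe()` accepts a `maxactive` keyword, and it assumes the return handler is attached explicitly under a hypothetical name such as `trace_accept_return` instead of relying on the `kretprobe__` auto-attach prefix. The value 100 is arbitrary.

```python
def attach_accept_return_probe(bpf, fn_name="trace_accept_return", maxactive=100):
    """Attach the return probe on inet_csk_accept() with an explicit maxactive.

    `bpf` is expected to be a bcc.BPF instance whose program defines
    `fn_name`; both the handler name and the default of 100 are
    illustrative assumptions.
    """
    bpf.attach_kretprobe(
        event="inet_csk_accept",
        fn_name=fn_name,
        maxactive=maxactive,  # assumed keyword; requires kernel support (4.12+)
    )
```

The helper only forwards the value; on older kernels the attach call would presumably fail, so a real tool would need a fallback to the default.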