iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
20.61k stars 3.89k forks source link

bcc/tools/memleak not working with userland processes #1535

Open jonesmz opened 6 years ago

jonesmz commented 6 years ago

If I run "/usr/share/bcc/tools/memleak" without arguments, it seems to work fine with regards to attaching to the kernel and tracing allocations.

But If I run /usr/share/bcc/tools/memleak -c ls, i get error output.

/usr/share/bcc/tools/memleak -c ls
Executing 'ls' and tracing the resulting process.
arch   certs    CREDITS  Documentation  firmware  include  ipc     Kconfig  lib          Makefile  modules.order  README          samples  security  tools  virt
block  COPYING  crypto   drivers        fs        init     Kbuild  kernel   MAINTAINERS  mm        net            REPORTING-BUGS  scripts  sound     usr
Attaching to pid 24202, Ctrl+C to quit.
bpf: Failed to load program: Invalid argument

Traceback (most recent call last):
  File "/usr/share/bcc/tools/memleak", line 409, in <module>
    attach_probes("malloc")
  File "/usr/share/bcc/tools/memleak", line 399, in attach_probes
    pid=pid)
  File "/usr/lib/python2.7/site-packages/bcc/__init__.py", line 830, in attach_uprobe
    fn = self.load_func(fn_name, BPF.KPROBE)
  File "/usr/lib/python2.7/site-packages/bcc/__init__.py", line 348, in load_func
    (func_name, errstr))
Exception: Failed to load BPF program malloc_enter: Invalid argument

This is kernel 4.9.75, for what it's worth.

Notably, I apparently have CONFIG_HAVE_EBPF_JIT instead of CONFIG_HAVE_BPF_JIT. Would that cause this?

I can supply the .config file if needed.

BTW, I tested with both the 5.0 release, as well as the latest git revision.

yonghong-song commented 6 years ago

Looks like this is due to the process "ls" is gone too fast and at the time attach_uprobe is called, the process is gone, and the attachment failed:

open(/proc/<pid>/ns/mnt): No such file or directory
perf_event_open(/sys/kernel/debug/tracing/events/uprobes/p__lib64_libc_so_6_0x8cc40_3112_bcc_3111/id): No such process

Could you try some long-running processes?

jonesmz commented 6 years ago
sudo /usr/share/bcc/tools/memleak -p 795
Attaching to pid 795, Ctrl+C to quit.
bpf: Failed to load program: Invalid argument

Traceback (most recent call last):
  File "/usr/share/bcc/tools/memleak", line 409, in <module>
    attach_probes("malloc")
  File "/usr/share/bcc/tools/memleak", line 399, in attach_probes
    pid=pid)
  File "/usr/lib/python2.7/site-packages/bcc/__init__.py", line 830, in attach_uprobe
    fn = self.load_func(fn_name, BPF.KPROBE)
  File "/usr/lib/python2.7/site-packages/bcc/__init__.py", line 348, in load_func
    (func_name, errstr))
Exception: Failed to load BPF program malloc_enter: Invalid argument
ps aux | grep 795
root       795  0.0  0.0  22652  1816 ?        Ss   Jan12   0:00 /usr/sbin/crond -n
admin     4143  0.0  0.0 112676  2312 pts/1    S+   22:42   0:00 grep --color=auto 795
jonesmz commented 6 years ago

I don't have a /sys/kernel/debug/tracing/events/uprobes folder.

Is that an indicator of bad kernel config options?

yonghong-song commented 6 years ago

No, no /sys/kernel/debug/tracing/events/uprobes does not mean uprobe is not disabled in kernel as it will be created dynamically.

Please do make sure

CONFIG_UPROBES=y

in your kernel config file.

jonesmz commented 6 years ago

CONFIG_UPROBES=y is set in my kernel config.

Any other debugging steps I can run?

yonghong-song commented 6 years ago

Does it work with any uprobe at all, e.g., trace.py 'c:malloc'? Maybe this is file system issue (e.g., overlay file system won't support uprobe)?

jonesmz commented 6 years ago

Hrmm, doesn't look like that works.

$ /usr/share/bcc/tools/trace 'c:malloc' bpf: Failed to load program: Invalid argument

Failed to load BPF program probe_malloc_1: Invalid argument

I'm not using any exotic filesystem setup.

Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/vda1 ext4 41152832 15998972 23040376 41% / devtmpfs devtmpfs 10241740 0 10241740 0% /dev tmpfs tmpfs 10277396 0 10277396 0% /dev/shm tmpfs tmpfs 10277396 33256 10244140 1% /run tmpfs tmpfs 10277396 0 10277396 0% /sys/fs/cgroup tmpfs tmpfs 2055480 0 2055480 0% /run/user/1000

yonghong-song commented 6 years ago

Just to confirm. kprobe and kernel tracepoint working fine, right?

The error message indicates that bpf syscall with command BPF_PROG_LOAD failed with -EINVAL return value in the kernel.

Could you run the following to at least confirm that bpf syscall is reached?

echo 'p:trace_SyS_bpf SyS_bpf' > /sys/kernel/debug/tracing/kprobe_events 
echo 1 > /sys/kernel/debug/tracing/events/kprobes/trace_SyS_bpf/enable
cat /sys/kernel/debug/tracing/trace_pipe

In a different window, run your command like "trace.py 'c:malloc'". Any output from trace_pipe at least indicates that bpf syscall is reached.

If bpf syscall is indeed called, the following are a few places where -EINVAL could return (incomplete lists):

        if (CHECK_ATTR(BPF_PROG_LOAD))
                return -EINVAL;

        if (attr->prog_flags & ~BPF_F_STRICT_ALIGNMENT)
                return -EINVAL;

        if (type == BPF_PROG_TYPE_KPROBE &&
            attr->kern_version != LINUX_VERSION_CODE)
                return -EINVAL;

        if (type >= ARRAY_SIZE(bpf_prog_types) || !bpf_prog_types[type])
                return -EINVAL;

static int bpf_obj_name_cpy(char *dst, const char *src)
{
        const char *end = src + BPF_OBJ_NAME_LEN;

        memset(dst, 0, BPF_OBJ_NAME_LEN);

        /* Copy all isalnum() and '_' char */
        while (src < end && *src) {
                if (!isalnum(*src) && *src != '_')
                        return -EINVAL;
                *dst++ = *src++;
        }

        /* No '\0' found in BPF_OBJ_NAME_LEN number of bytes */
        if (src == end)
                return -EINVAL;

        return 0;
}    

It may be worthwhile to print out all the syscall parameters to confirm that they are all in good shape.

jonesmz commented 6 years ago

Thanks so much for continuing to help! This is my first time trying something like bcc, so I'm a bit inexperienced here.

I was able to use the bcc memleak tool out of the box on my Gentoo system, I'm only having trouble with this centos7 system.

Here's the results from the trace command:

localhost:.../modules/4.9.75-edge_boot/build $ echo 'p:trace_SyS_bpf SyS_bpf' > /sys/kernel/debug/tracing/kprobe_events
   localhost:.../modules/4.9.75-edge_boot/build $ echo 1 > /sys/kernel/debug/tracing/events/kprobes/trace_SyS_bpf/enable
   localhost:.../modules/4.9.75-edge_boot/build $ cat /sys/kernel/debug/tracing/trace_pipe
          python-24607 [009] d... 374321.933543: trace_SyS_bpf: (SyS_bpf+0x0/0x880)
          python-24607 [009] d... 374321.933573: trace_SyS_bpf: (SyS_bpf+0x0/0x880)
          python-24607 [009] d... 374322.609852: trace_SyS_bpf: (SyS_bpf+0x0/0x880)
          python-24607 [009] d... 374322.609874: trace_SyS_bpf: (SyS_bpf+0x0/0x880)
          python-24607 [009] d... 374322.609877: trace_SyS_bpf: (SyS_bpf+0x0/0x880)
jonesmz commented 6 years ago

With your comment about printing the syscall params, I take it you mean modifying the code of tracing.py to print out the parameters being passed in to the various calls?

yonghong-song commented 6 years ago

To be precise, you may need to build bcc from source (see INSTALL.md), and then modify libbpf.c for bpf syscall with BPF_LOAD command to check all attributes to see whether they are making sense.

jonesmz commented 6 years ago

Adding

fprintf(stderr, "DEBUG: program name : %s, BPF_OBJ_NAME_LEN-1 : %d, name_len : %d, # insns : %u, license : %s, kern_version : %u, LINUX_VERSION_CODE : %u, log_level : %d, log : %s\n", name, BPF_OBJ_NAME_LEN-1, name_len, attr.insn_cnt, license, kern_version, LINUX_VERSION_CODE, log_level, log_buf);

to bpf_prog_load

I see that

DEBUG: program name : probe_malloc_1, BPF_OBJ_NAME_LEN-1 : 15, name_len : 14, # insns : 25, license : GPL, kern_version : 264524, LINUX_VERSION_CODE : 199168, log_level : 0, log : (null)

The kern_version, and the LINUX_VERSION_CODE aren't matching.

After consulting with my coworker, it looks like the source code provided isn't what was used to build the kernel.

I've installed the correct version, and now the program is working.

Is this something that the bpf_prog_load function should check for directly? It's an easy check and could save a lot of confusion.

palmtenor commented 6 years ago

@jonesmz Just to confirm, you meant that the /modules/4.9.75-edge_boot/build used by BCC does not contain the actual source code used to build the Kernel, right?

jonesmz commented 6 years ago

That's correct, I was mistakenly using 4.9.76, but the kernel that I have on this system was built from 4.9.75. After downloading 4.9.75 and recompiling to get the autoconf.h file, the BCC tools (trace, memleak) are working how I would expect.

That being said, I noticed that my debug statements still show a mismatch between LINUX_VERSION_CODE and kern_version, so that was apparently not a good diagnostic to use in bpf_prog_load, even though it accidentally led me to a solution.

yonghong-song commented 6 years ago

Maybe we could add a warning in bpf_module.cc where after successful compilation (load_c and load_string), we can do a sanity check comparing module kern_version() vs. underlying system kernel version and WARN if there is any mismatch. This will be a problem for kprobe/uprobe and may not be an issue for others. But in any cases, it is a good thing to warn users about this.

Could you craft a patch for this?

yonghong-song commented 6 years ago

Actually, clang frontend already uses uname syscall to get the underlying kernel version and use that to find kernel headers. You can use it as an example. On the second thought, maybe we want to check at bpf_prog_load time since at that time we have the kern_version number which is set to attr. It is obvious that kern_version we are checking is going to the kernel.