falcosecurity / falco

Cloud Native Runtime Security
https://falco.org
Apache License 2.0

falco --modern-bpf fail (libbpf: failed to find valid kernel BTF) #2357

Closed petterreinholdtsen closed 1 year ago

petterreinholdtsen commented 1 year ago

Describe the bug

After managing to get the build limping along with some tape and chewing gum, as described in issue #2343, I got to a stage where I can test the resulting .deb. Sadly, it fails to load, like this:

Mon Jan 16 10:58:26 2023: Falco version: 0.33.1-1 (x86_64)
Mon Jan 16 10:58:26 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Mon Jan 16 10:58:26 2023: Loading rules from file /etc/falco/falco_rules.yaml
Mon Jan 16 10:58:26 2023: Loading rules from file /etc/falco/falco_rules.local.yaml
Mon Jan 16 10:58:26 2023: Loading rules from file /etc/falco/rules.d/nidhogg.yml
Rules match ignored syscall: warning (ignored-evttype):
Loaded rules match the following events: ppoll, semop, getdents, signaldeliver, getresuid, getegid, geteuid, getuid, sendfile, getresgid, pwrite, preadv, page_fault, pwritev, munlock, sendmmsg, io_uring_enter, fstat64, mlock2, getdents64, mlock, mlockall, fsconfig, select, copy_file_range, io_uring_register, getcwd, mmap2, mprotect, send, writev, recvmmsg, lseek, poll, munmap, llseek, epoll_wait, stat64, access, fstat, lstat, stat, futex, lstat64, pluginevent, getpeername, semget, write, brk, getsockname, pread, setsockopt, recv, getgid, nanosleep, readv, getrlimit, switch, semctl, munlockall, mmap, splice, read
But these events are not returned unless running falco with -A
Mon Jan 16 10:58:26 2023: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Mon Jan 16 10:58:26 2023: Starting health webserver with threadiness 1, listening on port 8765
Mon Jan 16 10:58:26 2023: Enabled event sources: syscall
Mon Jan 16 10:58:26 2023: Opening capture with modern BPF probe
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'bpf_probe'
libbpf: failed to load BPF skeleton 'bpf_probe': -3
libpman: failed to load BPF object (errno: 3 | message: No such process)
Mon Jan 16 10:58:26 2023: An error occurred in an event source, forcing termination...
Error: 
Events detected: 0
Rule counts by severity:
Triggered rules by rule name:

After some web searches and stracing, I suspect this is because the code fails to find the Linux kernel in /boot. The code looks for /boot/vmlinux-6.0.0-6-amd64, while the installed image is /boot/vmlinuz-6.0.0-6-amd64. Note the x->z difference, signaling a compressed kernel. Any idea how to get around this?

How to reproduce it

Build the Debian package using the git repo from https://salsa.debian.org/pere/falco.git (run 'debuild' from the devscripts package after running 'sudo apt build-dep .' in the git repo).

Expected behaviour

I expected falco to load the bpf module and start running, not exit with an error message.

Environment

jasondellaluce commented 1 year ago

cc @Andreagit97

Andreagit97 commented 1 year ago

hey @petterreinholdtsen you have a really recent kernel, this is quite strange, could you check some info?

  1. Run ls -la /sys/kernel/btf/vmlinux. You should see the vmlinux file, since sysfs should export it (from kernel 5.5+):

    -r--r--r-- 1 root root 5178599 gen 16 11:45 /sys/kernel/btf/vmlinux
  2. Could you check your kernel configs?

    You should have at least these 2 configs enabled:

    CONFIG_DEBUG_INFO_BTF=y
    CONFIG_DEBUG_INFO=y

    I've checked the latest libbpf version, and these are the only paths it checks (https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4771):

    const char *locations[] = {
        /* try canonical vmlinux BTF through sysfs first */
        "/sys/kernel/btf/vmlinux",
        /* fall back to trying to find vmlinux on disk otherwise */
        "/boot/vmlinux-%1$s",
        "/lib/modules/%1$s/vmlinux-%1$s",
        "/lib/modules/%1$s/build/vmlinux",
        "/usr/lib/modules/%1$s/kernel/vmlinux",
        "/usr/lib/debug/boot/vmlinux-%1$s",
        "/usr/lib/debug/boot/vmlinux-%1$s.debug",
        "/usr/lib/debug/lib/modules/%1$s/vmlinux",
    };
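
For anyone who wants to check these locations by hand, here is a minimal sketch that expands the same templates for a given kernel release and reports which ones are readable. The template list is copied from the snippet above; the function names are hypothetical and this is not libbpf code, only a rough equivalent of its path probing.

```python
import os

# Path templates copied from the libbpf snippet above. "%1$s" in the C source
# stands for the kernel release string (what `uname -r` prints).
BTF_LOCATION_TEMPLATES = [
    "/sys/kernel/btf/vmlinux",
    "/boot/vmlinux-{rel}",
    "/lib/modules/{rel}/vmlinux-{rel}",
    "/lib/modules/{rel}/build/vmlinux",
    "/usr/lib/modules/{rel}/kernel/vmlinux",
    "/usr/lib/debug/boot/vmlinux-{rel}",
    "/usr/lib/debug/boot/vmlinux-{rel}.debug",
    "/usr/lib/debug/lib/modules/{rel}/vmlinux",
]


def candidate_btf_paths(release):
    """Expand every template for one kernel release string."""
    return [template.format(rel=release) for template in BTF_LOCATION_TEMPLATES]


def readable_btf_paths(release):
    """Return only the candidates that exist and are readable here."""
    return [path for path in candidate_btf_paths(release) if os.access(path, os.R_OK)]
```

Calling `readable_btf_paths(os.uname().release)` on the affected machine should list at least `/sys/kernel/btf/vmlinux` if the kernel exports BTF through sysfs.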

petterreinholdtsen commented 1 year ago

[Andrea Terzolo]

hey @petterreinholdtsen you have a really recent kernel, this is quite strange, could you check some info?

Sure.

@.:~# ls -la /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 4212335 Jan 16 10:53 /sys/kernel/btf/vmlinux
@.:~# grep CONFIG_DEBUG_INFO /boot/config-6.0.0-6-amd64
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_NONE is not set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_DEBUG_INFO_DWARF5 is not set
# CONFIG_DEBUG_INFO_REDUCED is not set
# CONFIG_DEBUG_INFO_COMPRESSED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_DEBUG_INFO_BTF=y
CONFIG_DEBUG_INFO_BTF_MODULES=y
@.***:~#

Here is the list of vmlinu* entries from strace:

@.:~# strace -f falco --modern-bpf 2>&1 |grep vmlinu
[pid 5404] bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_RINGBUF, key_size=0, value_size=0, max_entries=8388608, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 72) = 7
[pid 5404] access("/sys/kernel/btf/vmlinux", R_OK) = 0
[pid 5404] openat(AT_FDCWD, "/sys/kernel/btf/vmlinux", O_RDONLY) = 8
[pid 5404] access("/boot/vmlinux-6.0.0-6-amd64", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] access("/lib/modules/6.0.0-6-amd64/vmlinux-6.0.0-6-amd64", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] access("/lib/modules/6.0.0-6-amd64/build/vmlinux", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] access("/usr/lib/modules/6.0.0-6-amd64/kernel/vmlinux", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] access("/usr/lib/debug/boot/vmlinux-6.0.0-6-amd64", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] access("/usr/lib/debug/boot/vmlinux-6.0.0-6-amd64.debug", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] access("/usr/lib/debug/lib/modules/6.0.0-6-amd64/vmlinux", R_OK) = -1 ENOENT (No such file or directory)
[pid 5404] write(2, "libbpf: Error loading vmlinux BT"..., 38libbpf: Error loading vmlinux BTF: -3
@.:~#

-- Happy hacking Petter Reinholdtsen
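
The config check in the grep output above can also be scripted. This is a small sketch, assuming the usual kernel config text format (`CONFIG_X=y` lines, and `# CONFIG_X is not set` comments for disabled options); the function names are hypothetical, and the two required options are the ones named earlier in this thread.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* options that are set to 'y'."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_X is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value == "y":
            enabled.add(key)
    return enabled


def has_btf_support(config_text):
    """True when both options requested earlier in the thread are enabled."""
    required = {"CONFIG_DEBUG_INFO", "CONFIG_DEBUG_INFO_BTF"}
    return required <= enabled_options(config_text)
```

Feeding it the contents of /boot/config-6.0.0-6-amd64 from this machine would return True, which matches the grep output.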

petterreinholdtsen commented 1 year ago

Tried to extract the kernel using <URL: https://raw.githubusercontent.com/torvalds/linux/master/scripts/extract-vmlinux > and reran the strace, but even with the uncompressed kernel available, there is no improvement:

@.:/boot# falco --modern-bpf
Mon Jan 16 12:46:49 2023: Falco version: 0.33.1-1 (x86_64)
Mon Jan 16 12:46:49 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Mon Jan 16 12:46:49 2023: Loading rules from file /etc/falco/falco_rules.yaml
Mon Jan 16 12:46:49 2023: Loading rules from file /etc/falco/falco_rules.local.yaml
Mon Jan 16 12:46:49 2023: Loading rules from file /etc/falco/rules.d/nidhogg.yml
Rules match ignored syscall: warning (ignored-evttype):
Loaded rules match the following events: ppoll, semop, getdents, signaldeliver, getresuid, getegid, geteuid, getuid, sendfile, getresgid, pwrite, preadv, page_fault, pwritev, munlock, sendmmsg, io_uring_enter, fstat64, mlock2, getdents64, mlock, mlockall, fsconfig, select, copy_file_range, io_uring_register, getcwd, mmap2, mprotect, send, writev, recvmmsg, lseek, poll, munmap, llseek, epoll_wait, stat64, access, fstat, lstat, stat, futex, lstat64, pluginevent, getpeername, semget, write, brk, getsockname, pread, setsockopt, recv, getgid, nanosleep, readv, getrlimit, switch, semctl, munlockall, mmap, splice, read
But these events are not returned unless running falco with -A
Mon Jan 16 12:46:49 2023: The chosen syscall buffer dimension is: 8388608 bytes (8 MBs)
Mon Jan 16 12:46:49 2023: Starting health webserver with threadiness 1, listening on port 8765
Mon Jan 16 12:46:49 2023: Enabled event sources: syscall
Mon Jan 16 12:46:49 2023: Opening capture with modern BPF probe
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'bpf_probe'
libbpf: failed to load BPF skeleton 'bpf_probe': -3
libpman: failed to load BPF object (errno: 3 | message: No such process)
Mon Jan 16 12:46:49 2023: An error occurred in an event source, forcing termination...
Events detected: 0
Rule counts by severity:
Triggered rules by rule name:
Error:
@.:/boot#

@.:/boot# strace -f falco --modern-bpf 2>&1 |grep vmlinu
[pid 6547] bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_RINGBUF, key_size=0, value_size=0, max_entries=8388608, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 72) = 7
[pid 6547] access("/sys/kernel/btf/vmlinux", R_OK) = 0
[pid 6547] openat(AT_FDCWD, "/sys/kernel/btf/vmlinux", O_RDONLY) = 8
[pid 6547] access("/boot/vmlinux-6.0.0-6-amd64", R_OK) = 0
[pid 6547] openat(AT_FDCWD, "/boot/vmlinux-6.0.0-6-amd64", O_RDONLY|O_CLOEXEC) = 8
[pid 6547] access("/lib/modules/6.0.0-6-amd64/vmlinux-6.0.0-6-amd64", R_OK) = -1 ENOENT (No such file or directory)
[pid 6547] access("/lib/modules/6.0.0-6-amd64/build/vmlinux", R_OK) = -1 ENOENT (No such file or directory)
[pid 6547] access("/usr/lib/modules/6.0.0-6-amd64/kernel/vmlinux", R_OK) = -1 ENOENT (No such file or directory)
[pid 6547] access("/usr/lib/debug/boot/vmlinux-6.0.0-6-amd64", R_OK) = -1 ENOENT (No such file or directory)
[pid 6547] access("/usr/lib/debug/boot/vmlinux-6.0.0-6-amd64.debug", R_OK) = -1 ENOENT (No such file or directory)
[pid 6547] access("/usr/lib/debug/lib/modules/6.0.0-6-amd64/vmlinux", R_OK) = -1 ENOENT (No such file or directory)
[pid 6547] write(2, "libbpf: Error loading vmlinux BT"..., 38libbpf: Error loading vmlinux BTF: -3
@.:/boot#

-- Happy hacking Petter Reinholdtsen

Andreagit97 commented 1 year ago

The BTF file seems to be where we expect it, but for some reason libbpf fails to parse it... It looks more like a libbpf issue than a Falco one, but we must be sure of that before reporting it to the mailing list. I will try to craft a small reproducible example unrelated to Falco, just to see whether we can trigger the issue again. I will post the link here when it is ready; it would be amazing if you could try it, just to be sure there is an issue with libbpf :)
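
The "file is there, parsing fails" reading can be checked directly against the strace output posted above. A rough sketch, assuming the `access("<path>", R_OK) = <ret>` line shape seen in this thread; the function name is hypothetical.

```python
import re

# Matches strace lines such as:
#   [pid 5404] access("/boot/vmlinux-6.0.0-6-amd64", R_OK) = -1 ENOENT (...)
ACCESS_RE = re.compile(r'access\("([^"]+)", R_OK\)\s+=\s+(-?\d+)')


def btf_access_results(strace_text):
    """Map each probed path to True if access() returned 0, else False."""
    results = {}
    for match in ACCESS_RE.finditer(strace_text):
        path, ret = match.group(1), int(match.group(2))
        results[path] = (ret == 0)
    return results
```

If `/sys/kernel/btf/vmlinux` maps to True while libbpf still reports "failed to find valid kernel BTF", the problem is in reading the file, not in locating it.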

Andreagit97 commented 1 year ago

ei @petterreinholdtsen here we are https://github.com/Andreagit97/BPF-perf-tests. You should build the btf_loading example with make btf_loading

Andreagit97 commented 1 year ago

If we face issues in this case too, it means there is something wrong with how libbpf parses BTF on your kernel. In this repo I use a different libbpf version from the one we use in Falco, just to see if something changes. We can simply change it in a second step if necessary :)

petterreinholdtsen commented 1 year ago

[Andrea Terzolo]

ei @petterreinholdtsen here we are https://github.com/Andreagit97/BPF-perf-tests. You should build the btf_loading example with make btf_loading

After building in a chroot and copying the binary to the VM, this is the output from the test run:

# /tmp/btf_loading
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'btf_loading_bpf'
libbpf: failed to load BPF skeleton 'btf_loading_bpf': -3
Failed to load and verify BPF skeleton. Errno: 3, message: No such process
#

-- Happy hacking Petter Reinholdtsen

Andreagit97 commented 1 year ago

Hmm, thank you for this. Could you run the program with verbose mode enabled (btf_loading --verbose)? That way we should see the reason for the failure, or at least some hints :thinking:

petterreinholdtsen commented 1 year ago

[Andrea Terzolo]

Uhm thank you for this, could you run the program with the verbose mode enabled btf_loading --verbose ? In this way we should see the reason of failure or at least some hints :thinking:

Sure.

@.:~/src/BPF-perf-tests/templates# /tmp/btf_loading --verbose
libbpf: loading object 'btf_loading_bpf' from buffer
libbpf: elf: section(3) tp_btf/sys_enter, size 56, link 0, flags 6, type=1
libbpf: sec 'tp_btf/sys_enter': found program 'sys_enter_trace' at insn offset 0 (0 bytes), code size 7 insns (56 bytes)
libbpf: elf: section(4) .reltp_btf/sys_enter, size 16, link 12, flags 40, type=9
libbpf: elf: section(5) license, size 13, link 0, flags 3, type=1
libbpf: license of btf_loading_bpf is Dual BSD/GPL
libbpf: elf: section(6) .bss, size 8, link 0, flags 3, type=8
libbpf: elf: section(7) .BTF, size 419, link 0, flags 0, type=1
libbpf: elf: section(9) .BTF.ext, size 96, link 0, flags 0, type=1
libbpf: elf: section(12) .symtab, size 120, link 1, flags 0, type=2
libbpf: looking for externs among 5 symbols...
libbpf: collected 0 externs total
libbpf: map 'btf_load.bss' (global data): at sec_idx 6, offset 0, flags 400.
libbpf: map 0 is "btf_load.bss"
libbpf: sec '.reltp_btf/sys_enter': collecting relocation for section(3) 'tp_btf/sys_enter'
libbpf: sec '.reltp_btf/sys_enter': relo #0: insn #0 against 'stop'
libbpf: prog 'sys_enter_trace': found data map 0 (btf_load.bss, sec 6, off 0) for insn 0
libbpf: Unsupported BTF_KIND:19
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': -22
libbpf: Unsupported BTF_KIND:19
libbpf: loading kernel BTF '/boot/vmlinux-6.0.0-6-amd64': -22
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'btf_loading_bpf'
libbpf: failed to load BPF skeleton 'btf_loading_bpf': -3
Failed to load and verify BPF skeleton. Errno: 3, message: No such process
@.:~/src/BPF-perf-tests/templates#

-- Happy hacking Petter Reinholdtsen
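
A note on the numbers in the verbose output above: the codes printed by libbpf are negated errno values. On Linux, -22 is EINVAL (the per-file parse failure) and -3 is ESRCH, which is why the final message reads "No such process" even though no process is involved. A minimal sketch for decoding such codes (the helper name is hypothetical):

```python
import errno
import os


def describe_libbpf_error(code):
    """Turn a negative errno-style return code into 'NAME: message'."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"
```

On the machine in this thread, `describe_libbpf_error(-22)` and `describe_libbpf_error(-3)` would name EINVAL and ESRCH respectively.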

petterreinholdtsen commented 1 year ago

It occurred to me that perhaps you want to see the messages also without the unpacked kernel:

@.:~/src/BPF-perf-tests/templates# /tmp/btf_loading --verbose
libbpf: loading object 'btf_loading_bpf' from buffer
libbpf: elf: section(3) tp_btf/sys_enter, size 56, link 0, flags 6, type=1
libbpf: sec 'tp_btf/sys_enter': found program 'sys_enter_trace' at insn offset 0 (0 bytes), code size 7 insns (56 bytes)
libbpf: elf: section(4) .reltp_btf/sys_enter, size 16, link 12, flags 40, type=9
libbpf: elf: section(5) license, size 13, link 0, flags 3, type=1
libbpf: license of btf_loading_bpf is Dual BSD/GPL
libbpf: elf: section(6) .bss, size 8, link 0, flags 3, type=8
libbpf: elf: section(7) .BTF, size 419, link 0, flags 0, type=1
libbpf: elf: section(9) .BTF.ext, size 96, link 0, flags 0, type=1
libbpf: elf: section(12) .symtab, size 120, link 1, flags 0, type=2
libbpf: looking for externs among 5 symbols...
libbpf: collected 0 externs total
libbpf: map 'btf_load.bss' (global data): at sec_idx 6, offset 0, flags 400.
libbpf: map 0 is "btf_load.bss"
libbpf: sec '.reltp_btf/sys_enter': collecting relocation for section(3) 'tp_btf/sys_enter'
libbpf: sec '.reltp_btf/sys_enter': relo #0: insn #0 against 'stop'
libbpf: prog 'sys_enter_trace': found data map 0 (btf_load.bss, sec 6, off 0) for insn 0
libbpf: Unsupported BTF_KIND:19
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': -22
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'btf_loading_bpf'
libbpf: failed to load BPF skeleton 'btf_loading_bpf': -3
Failed to load and verify BPF skeleton. Errno: 3, message: No such process
@.:~/src/BPF-perf-tests/templates#

-- Happy hacking Petter Reinholdtsen

Andreagit97 commented 1 year ago

Hmm, so the error seems to be here: https://github.com/libbpf/libbpf/blob/3423d5e7cdab356d115aef7f987b4a1098ede448/src/btf.c#L709. BTF_KIND_ENUM64 (kind 19 in the log above) does not seem to be handled. This requires very deep knowledge of BTF, so it is probably better to report it to the mailing list.

Just one final info, could you dump the BTF file?

bpftool btf dump file /sys/kernel/btf/vmlinux format raw > vmlinux.h 

Not sure you can upload it here since it would be quite huge; maybe you can search for something like ENUM_64 or ENUM64 inside it. If you find them, the issue is probably exactly what I mentioned before :thinking:
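
The search suggested here can be done with grep, or with a few lines of Python when a per-kind summary is wanted. This sketch assumes that each type in the raw bpftool dump starts a line of the form `[<id>] <KIND> '<name>' ...`, with member lines indented below it; that line shape is an assumption about the dump format, and the function name is hypothetical.

```python
import re

# Assumed shape of a type line in `bpftool btf dump ... format raw` output:
#   [<id>] <KIND> '<name>' ...
RAW_TYPE_LINE = re.compile(r"^\[(\d+)\] (\w+) ")


def count_kinds(dump_text):
    """Count how many types of each kind appear in a raw BTF text dump."""
    counts = {}
    for line in dump_text.splitlines():
        match = RAW_TYPE_LINE.match(line)
        if match:
            kind = match.group(2)
            counts[kind] = counts.get(kind, 0) + 1
    return counts
```

A non-zero `count_kinds(text).get("ENUM64", 0)` would confirm that the kernel's BTF contains 64-bit enums.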

petterreinholdtsen commented 1 year ago

Here it is. btfdump.txt

Andreagit97 commented 1 year ago

Yeah, there are some ENUM64 entries; it seems libbpf is not able to parse them yet. It would be amazing if you could report it to the bpf@vger.kernel.org mailing list. You should provide the full error log (https://github.com/falcosecurity/falco/issues/2357#issuecomment-1384218517), the example we used to detect the error, the kernel version, and the OS version. That should be enough.
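
To illustrate why a single unknown kind makes the whole file unreadable: BTF type records are variable-length, and the number of bytes that follow each 12-byte record depends on its kind. A parser that does not know kind 19 (BTF_KIND_ENUM64) cannot tell where the next record starts, so it has to give up. The sketch below walks the type section of a raw BTF blob under these assumptions: little-endian data, the record layout from the kernel's BTF documentation, and a hypothetical `max_kind` parameter that mimics a parser built before a given kind existed. It is not libbpf code.

```python
import struct

BTF_MAGIC = 0xEB9F

# Bytes of kind-specific data after each 12-byte btf_type record:
# kind -> (fixed bytes, bytes per vlen entry).
EXTRA = {
    1: (4, 0),    # INT
    2: (0, 0),    # PTR
    3: (12, 0),   # ARRAY
    4: (0, 12),   # STRUCT
    5: (0, 12),   # UNION
    6: (0, 8),    # ENUM
    7: (0, 0),    # FWD
    8: (0, 0),    # TYPEDEF
    9: (0, 0),    # VOLATILE
    10: (0, 0),   # CONST
    11: (0, 0),   # RESTRICT
    12: (0, 0),   # FUNC
    13: (0, 8),   # FUNC_PROTO
    14: (4, 0),   # VAR
    15: (0, 12),  # DATASEC
    16: (0, 0),   # FLOAT
    17: (4, 0),   # DECL_TAG
    18: (0, 0),   # TYPE_TAG
    19: (0, 12),  # ENUM64
}


def count_btf_kinds(blob, max_kind=19):
    """Count type records per kind; fail on kinds above max_kind."""
    magic, _version, _flags, hdr_len, type_off, type_len = struct.unpack_from(
        "<HBBIII", blob, 0
    )
    if magic != BTF_MAGIC:
        raise ValueError("not a little-endian BTF blob")
    pos = hdr_len + type_off
    end = pos + type_len
    counts = {}
    while pos < end:
        _name_off, info, _size_or_type = struct.unpack_from("<III", blob, pos)
        kind = (info >> 24) & 0x1F
        vlen = info & 0xFFFF
        if kind not in EXTRA or kind > max_kind:
            # The record size is unknown, so the walk cannot continue.
            raise ValueError(f"Unsupported BTF_KIND:{kind}")
        fixed, per_entry = EXTRA[kind]
        pos += 12 + fixed + per_entry * vlen
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Reading /sys/kernel/btf/vmlinux in binary mode and passing the bytes with `max_kind=18` should raise the same "Unsupported BTF_KIND:19" message on a kernel whose BTF contains ENUM64 entries, while the default `max_kind=19` should walk the whole section.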

petterreinholdtsen commented 1 year ago

[Andrea Terzolo]

yeah there are some ENUM64, it seems like libbpf is not able to parse them yet, it would be amazing if you could report it to the bpf@vger.kernel.org mailing list,

I do not really understand enough of this to explain anything to the libbpf developers. I have just been parroting instructions from you so far. If they had questions, I would not be able to answer them.

-- Happy hacking Petter Reinholdtsen

Andreagit97 commented 1 year ago

Ok I can do that, probably they will have some questions regarding your machine, in that case, I will notify you here

petterreinholdtsen commented 1 year ago

[Andrea Terzolo]

Ok I can do that, probably they will have some questions regarding your machine, in that case, I will notify you here

The machine is a QEMU-based virtual machine with a Debian Bookworm installation. <URL: https://tracker.debian.org/pkg/libbpf > shows that the version installed on Bookworm at the moment is 1.1.0-1.

-- Happy hacking Petter Reinholdtsen

poiana commented 1 year ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Andreagit97 commented 1 year ago

Update: it turns out that there is no issue with libbpf; it seems to be an issue with your machine :/ Do you have any updates on this issue?

Andreagit97 commented 1 year ago

/remove-lifecycle stale

poiana commented 1 year ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana commented 1 year ago

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.

/close

poiana commented 1 year ago

@poiana: Closing this issue.

ljlu1504 commented 3 months ago

Update: it turns out that there is no issue with libbpf it seems an issue with your machine :/ Do you have any updates on this issue?

Do you have a link to the libbpf issue? I have a similar symptom:

libbpf: prog 'netif_rx_hook': found map 1 (pt_map, sec 13, off 32) for insn #202
libbpf: sec '.reltracepoint/net/netif_receive_skb': relo #2: insn #264 against 'pt_map'
libbpf: prog 'netif_rx_hook': found map 1 (pt_map, sec 13, off 32) for insn #264
libbpf: Unsupported BTF_KIND:19
libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': -22
libbpf: Unsupported BTF_KIND:19
libbpf: loading kernel BTF '/lib/modules/6.8.0-dirty/build/vmlinux': -22
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'tcpping_bpf'
libbpf: failed to load BPF skeleton 'tcpping_bpf': -3
failed to load BPF object