iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0

Required dependencies for libbpf-tools? #3224

Open mika opened 3 years ago

mika commented 3 years ago

Following up on Brendan Gregg's excellent blog post at http://www.brendangregg.com/blog/2020-11-04/bpf-co-re-btf-libbpf.html, I looked into packaging the libbpf-tools binaries (biolatency, biopattern, biosnoop, biostacks, bitesize, cpudist, drsnoop, execsnoop, filelife, hardirqs, llcstat, numamove, opensnoop, readahead, runqlat, runqlen, runqslower, softirqs, syscount, tcpconnect, tcpconnlat, vfsstat + xfsslower), so that they can be used on systems without having to install lots of additional packages.

I did my packaging work on current Debian testing/bullseye, see https://gist.github.com/mika/e55ab72659bda90d8fecbbb42830d250

Most of the binaries work as intended, though readahead, xfsslower + llcstat are failing for me (both on the system where I built the binaries and on another Debian/bullseye system with just the compiled binaries and a matching kernel with CONFIG_DEBUG_INFO_BTF enabled):

% sudo readahead
libbpf: failed to find kernel BTF type ID of '__do_page_cache_readahead': -3
libbpf: failed to load object 'readahead_bpf'
libbpf: failed to load BPF skeleton 'readahead_bpf': -3
failed to open and/or load BPF ojbect

% sudo xfsslower
libbpf: kprobe perf_event_open() failed: No such file or directory
libbpf: prog 'xfs_file_read_iter': failed to create kprobe 'xfs_file_read_iter' perf event: No such file or directory
libbpf: failed to auto-attach program 'xfs_file_read_iter': -2
failed to attach BPF programs

% sudo llcstat
failed to init perf sampling: No such file or directory
% sudo strace -f -s500 llcstat 2>&1 | tail
munmap(0x7f17e10f2000, 360448)          = 0
perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=PERF_COUNT_HW_CACHE_MISSES, ...}, -1, 0, -1, 0) = -1 ENOENT (No such file or directory)
write(2, "failed to init perf sampling: No such file or directory\n", 56failed to init perf sampling: No such file or directory) = 56
[...]

Any hints on what's missing here? (I'd like to ensure the BTF support is as good as possible on Debian/bullseye, since it is the upcoming stable release of Debian.)

Thanks!

anakryiko commented 3 years ago

So there shouldn't be any runtime dependencies in terms of libraries, beyond libz and libelf (for libbpf itself).

For readahead, it seems __do_page_cache_readahead was renamed to do_page_cache_ra between the 5.9 and 5.10 kernels, so we'll need to dynamically detect which name is right and attach accordingly. @ethercflow, will you get a chance to do this and test on 5.9 and 5.10 kernels?
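As an illustration of the dynamic detection mentioned above, here is a minimal sketch that looks up candidate function names in a kallsyms-format file such as /proc/kallsyms and picks the first one present. The helper names ksym_exists and pick_ksym are hypothetical, and scanning kallsyms is only one possible approach (an assumption on my part); the eventual fix in readahead may well work differently, for example by consulting kernel BTF.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `name` is listed as a symbol in a kallsyms-format file.
 * Each line has the form "<address> <type> <symbol>", optionally followed
 * by a tab and "[module]". */
bool ksym_exists(const char *path, const char *name)
{
    char line[512], sym[256];
    bool found = false;
    FILE *f = fopen(path, "r");

    if (!f)
        return false;
    while (!found && fgets(line, sizeof(line), f)) {
        /* Skip the address and type columns, keep the symbol column. */
        if (sscanf(line, "%*s %*s %255s", sym) == 1)
            found = strcmp(sym, name) == 0;
    }
    fclose(f);
    return found;
}

/* Hypothetical helper: return the first candidate name found in the
 * file at `path`, or NULL if none of the candidates is present. */
const char *pick_ksym(const char *path, const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (ksym_exists(path, names[i]))
            return names[i];
    return NULL;
}
```

A caller would pass "/proc/kallsyms" as the path and the two names { "do_page_cache_ra", "__do_page_cache_readahead" } as candidates, then attach to whichever name comes back. The path is a parameter only so that the lookup can be exercised against a sample file.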

For xfsslower, it seems xfs_file_read_iter is still there. I wonder if your kernel is built with CONFIG_XFS_FS enabled? If not, we'll need to dig deeper into why it suddenly broke.

For llcstat, it appears as if PERF_COUNT_HW_CACHE_MISSES is not supported? That sounds wrong for any more or less modern CPU, but I'm not an expert on that. Just to verify, try replacing PERF_TYPE_HARDWARE with PERF_TYPE_SOFTWARE, and PERF_COUNT_HW_CACHE_MISSES/PERF_COUNT_HW_CACHE_REFERENCES with PERF_COUNT_SW_CPU_CLOCK. It will give you totally wrong results, but at least we'll know if it's a problem in hardware support. See runqlen.c as an example (I assume runqlen works fine in your environment, right?).
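To make the suggested experiment concrete, here is a rough sketch of how the event attribute differs between the two cases. The helper init_sample_attr and its parameters are hypothetical and not taken from llcstat.c; only the PERF_* constants and the struct perf_event_attr fields come from the kernel's perf_event.h header.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not taken from llcstat.c: fill in a sampling event
 * that is either the hardware cache-miss counter or, for diagnosis only,
 * the software CPU clock. */
void init_sample_attr(struct perf_event_attr *attr, bool use_software,
                      unsigned long long sample_period)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->sample_period = sample_period;
    if (use_software) {
        /* Diagnostic substitute: available without PMU support. */
        attr->type = PERF_TYPE_SOFTWARE;
        attr->config = PERF_COUNT_SW_CPU_CLOCK;
    } else {
        /* The event that fails with ENOENT in the strace output. */
        attr->type = PERF_TYPE_HARDWARE;
        attr->config = PERF_COUNT_HW_CACHE_MISSES;
    }
}
```

If the tool starts with the software event but not with the hardware one, the problem is most likely in hardware counter support rather than in the tool or in libbpf.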

mika commented 3 years ago

Regarding xfsslower: oh right, after modprobe xfs it indeed works. Sorry for not considering this myself.

Regarding llcstat: with your suggested changes applied, it indeed works as intended. Yes, also runqlen works in my environment.

FTR, I am testing this from inside a VirtualBox VM:

% lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   39 bits physical, 48 bits virtual
CPU(s):                          1
On-line CPU(s) list:             0
Thread(s) per core:              1
Core(s) per socket:              1
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           142
Model name:                      Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
Stepping:                        10
CPU MHz:                         1991.970
BogoMIPS:                        3983.94
Hypervisor vendor:               KVM
Virtualization type:             full
L1d cache:                       32 KiB
L1i cache:                       32 KiB
L2 cache:                        256 KiB
L3 cache:                        8 MiB
NUMA node0 CPU(s):               0
Vulnerability Itlb multihit:     KVM: Mitigation: VMX unsupported
Vulnerability L1tf:              Mitigation; PTE Inversion
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline, STIBP disabled, RSB filling
Vulnerability Srbds:             Unknown: Dependent on hypervisor status
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase avx2 invpcid rdseed clflushopt md_clear flush_l1d

Thanks, @anakryiko!

anakryiko commented 3 years ago

> Regarding llcstat: with your suggested changes applied, it indeed works as intended. Yes, also runqlen works in my environment.
>
> FTR, I am testing this from inside a VirtualBox VM:

So it might be that VirtualBox doesn't support or pass through hardware counters. For qemu I use the -cpu host argument, which probably makes this all work. You'll need to figure out what to do for VirtualBox to make this work.
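One way to check from inside a guest whether the hardware counter is exposed at all is to try opening the event and look at errno. The sketch below is an assumption-laden illustration: the enum and both function names are hypothetical, and the grouping of errno values follows my reading of the perf_event_open(2) man page. The syscall arguments mirror the strace output earlier in this thread (pid -1, cpu 0, group_fd -1, flags 0), which needs root or equivalent privileges.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical result codes for the probe below. */
enum hw_probe { HW_OK, HW_UNSUPPORTED, HW_DENIED, HW_ERROR };

/* Map the outcome of perf_event_open() to a coarse result. ENOENT is the
 * error seen in the strace output; ENODEV and EOPNOTSUPP are treated the
 * same way here (an assumption based on the man page). */
enum hw_probe classify_probe(long fd, int err)
{
    if (fd >= 0)
        return HW_OK;
    if (err == ENOENT || err == ENODEV || err == EOPNOTSUPP)
        return HW_UNSUPPORTED;
    if (err == EACCES || err == EPERM)
        return HW_DENIED;
    return HW_ERROR;
}

/* Try to open the cache-miss counter on CPU 0 for all processes. */
enum hw_probe probe_hw_cache_misses(void)
{
    struct perf_event_attr attr;
    long fd;
    int err;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CACHE_MISSES;
    attr.disabled = 1; /* only testing availability, never counting */

    fd = syscall(SYS_perf_event_open, &attr, -1, 0, -1, 0);
    err = errno;
    if (fd >= 0)
        close((int)fd);
    return classify_probe(fd, err);
}
```

A result of HW_UNSUPPORTED in the guest, with the same binary reporting HW_OK on the host, would point at the hypervisor configuration rather than at llcstat.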

anakryiko commented 3 years ago

@mika, seems like there are efforts to package libbpf-tools on Fedora and ALT Linux. Would you mind joining the discussion so that it stays as consistent as possible across various distros? Please see https://github.com/iovisor/bcc/pull/3263#issuecomment-777023822, thanks!

mika commented 3 years ago

@anakryiko thank you for the pointer, will join!