libbpf / ci

LLVM feature detection and bpftool's runtime dependencies #101

Closed by qmonnet 8 months ago

qmonnet commented 1 year ago

TL;DR: I'm trying to fix feature detection for LLVM in the kernel's tools/build system. If I do, this breaks the CI because bpftool misses the LLVM libraries as runtime dependencies.

Context

In the kernel repository, under tools/build/feature, we can test for a number of "features" that may or may not be supported by the host for building tools. These are used by perf or bpftool, for example. In particular, there is a probe to detect the availability of LLVM libraries. But the detection is broken for LLVM v16+, and new versions of LLVM are always reported as missing. I'm trying to fix that.
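As a quick way to tell whether a host falls in the affected range, the sketch below extracts the major version from a version string such as the one printed by `llvm-config --version`. The function names are hypothetical, and this is not how the kernel's feature probe works; it only checks the "v16+" threshold mentioned above.

```shell
# Hypothetical helpers, not part of the kernel build system.

# Print the major component of a version string, e.g. "16.0.6" -> "16".
llvm_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed if the given LLVM version is in the range described as
# affected by the broken probe (major version 16 or higher).
llvm_probe_possibly_affected() {
    [ "$(llvm_major "$1")" -ge 16 ]
}
```

On a real host this could be called as `llvm_probe_possibly_affected "$(llvm-config --version)"`, assuming `llvm-config` is in `PATH`.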

The fix is simple, and I have a patch in preparation for it. But this patch also breaks the BPF CI.

Issue

Here's what happens:

The reason we never noticed the issue before is likely that the runtime dependencies are met for the alternative libbfd-based disassembler (binutils is installed in the rootfs), and that the CI has likely been using LLVM 16 or higher since before the switch to the LLVM disassembler by default was merged.
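To see which shared libraries a built bpftool binary cannot resolve inside a given rootfs, one option is to filter the output of `ldd`. The helper below is a minimal sketch with a hypothetical name; it assumes the usual `name => not found` format that glibc's `ldd` prints for unresolved libraries.

```shell
# Hypothetical helper: read ldd-style output on stdin and print the
# names of the libraries reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Run inside the rootfs, this could be used as `ldd ./bpftool | missing_libs`; an empty result would suggest that all runtime dependencies are met.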

Possible fix

The "easy" fix would be to add the LLVM libraries to the rootfs image (using the same version as used to compile bpftool, not just Debian's llvm-dev). But I'm not sure this is desirable, as it would increase the size and build time for the image. So I'm opening this Issue to get some feedback and see whether this is acceptable, or if someone has a better alternative to suggest.

Note: We don't need the JIT-disassembler at all for the selftests at the moment, so if we could build bpftool without LLVM support for selftests, this should solve the issue. But we don't currently have a clean way to disable LLVM support in bpftool (if the libraries are here, we use them, period). We could possibly hack something by crafting and passing a specific value for $(FEATURE_TESTS), or abusing $(LLVM_CONFIG) to make the LLVM feature detection fail, but both hacks would require a way to pass down arguments to the call to $(MAKE) for building bpftool from selftests' Makefile, which we currently don't have. And it doesn't look super clean. Similarly, building bpftool statically would require passing down the right flags, as well as getting compatible LLVM libraries (the default ones can only be used for linking dynamically - we'd have to build them or download some, like in bpftool's CI).

anakryiko commented 1 year ago

What if we used bpftool-specific make parameters to enable or disable some features? If they are environment variables, they should propagate into bpftool's Makefile, right? E.g.:

BPFTOOL_USE_LLVM=0 make -j90 -C tools/testing/selftests/bpf

Would that work?
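For reference, the propagation behaviour itself can be checked in isolation with GNU make: variables set in the environment become make variables and are passed on to sub-makes. The sketch below uses throwaway Makefiles and a hypothetical variable name, `DEMO_USE_LLVM`; it does not involve bpftool's real Makefile, which has no such knob.

```shell
# Hypothetical demo: an outer Makefile delegates to an inner one, and
# the inner one reads a variable that was set only in the environment.
demo_env_propagation() {
    tmp=$(mktemp -d) || return 1
    mkdir "$tmp/inner"
    # Outer Makefile: only calls $(MAKE) on the inner directory.
    printf 'all:\n\t@$(MAKE) --no-print-directory -C inner\n' > "$tmp/Makefile"
    # Inner Makefile: prints the value it sees for the variable.
    printf 'all:\n\t@echo "inner sees DEMO_USE_LLVM=$(DEMO_USE_LLVM)"\n' > "$tmp/inner/Makefile"
    DEMO_USE_LLVM=0 make --no-print-directory -C "$tmp"
    rc=$?
    rm -rf "$tmp"
    return $rc
}
```

This only shows that the value reaches the inner Makefile; whether that Makefile acts on it is a separate question.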

qmonnet commented 1 year ago

It would work, but we'd rather avoid adding parameters that could encourage distros to turn features off, so I'd like to find a way without this.

qmonnet commented 1 year ago

Haven't spent more time on this, but I need to check whether we can just use the bootstrap version of bpftool for this test. This way we wouldn't have to care about the LLVM dependency.

[edit: From what I understand, we're already supposed to use the bootstrap version in core_reloc since torvalds/linux@b03e19465b972bd06104207380e0e42e7f03ab29. I didn't realise that the bootstrap version would include the disassembler, which it has since torvalds/linux@d510296d331accd4afaa13498220c93ae690628a. I wonder if this is still necessary; it seems to me the additional files were added to the bootstrap version by mistake.]

qmonnet commented 9 months ago

I haven't investigated and figured out the details of the above, but I realised today that the initial issue, the detection for the llvm feature with LLVM 16+, has been fixed upstream with torvalds/linux@4e95ed4f4d5bc6838a10e6952999b41b1d07e56f (July 25th).

The feature has since broken again, in a different manner, with torvalds/linux@56b11a2126bf2f422831ecf6112b87a4485b221b from August 11th (the fix was discussed a few days ago, going through the perf tree).

Between the two, it seems that the CI didn't break because of bpftool, so the current GitHub issue is probably moot.

I'll leave the issue open for a bit longer anyway, just to see how things behave when llvm detection is fixed again - I'm curious to see if bpftool builds with libbfd or LLVM in the CI.
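One way to tell which disassembler backend a given bpftool binary was built with is to look at the `features:` line printed by `bpftool version`. The helper below is a sketch with a hypothetical name; it assumes the comma-separated `features: ...` format used by recent bpftool versions, which may differ in older releases.

```shell
# Hypothetical helper: read `bpftool version` output on stdin and print
# "llvm", "libbfd", or "none" depending on the listed features.
jit_disasm_backend() {
    # Keep only the text after "features:" (empty if the line is absent).
    features=$(sed -n 's/^features:[[:space:]]*//p')
    case ", $features," in
        *", llvm,"*)   echo llvm ;;
        *", libbfd,"*) echo libbfd ;;
        *)             echo none ;;
    esac
}
```

In the CI this could be called as `./bpftool version | jit_disasm_backend`.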