bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

eBPF builds fail in images with the 5.10 kernel in aarch64 #2947

Closed arnaldo2792 closed 1 year ago

arnaldo2792 commented 1 year ago

Image I'm using: Any image with the 5.10 kernel in aarch64

What I expected to happen: I can compile the eBPF probe regardless of the kernel / architecture combo

What actually happened: The compilation failed for the 5.10 kernel in aarch64

How to reproduce the problem: Attempt to compile the falco eBPF probe in an image with the 5.10 kernel in aarch64. You will see an error similar to this:

falco-driver-loader * Trying to compile the eBPF probe (falco_bottlerocket_5.10.165_1_1.13.1-aws.o)
falco-driver-loader Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/insn.h' differs from latest version at 'arch/arm64/include/asm/insn.h'
falco-driver-loader Warning: Kernel ABI header at 'tools/arch/arm64/lib/insn.c' differs from latest version at 'arch/arm64/lib/insn.c'
falco-driver-loader gcc: error: unrecognized command line option '-mbranch-protection=pac-ret+leaf'
falco-driver-loader gcc: error: unrecognized command line option '-fpatchable-function-entry=2'
falco-driver-loader make[2]: *** [scripts/Makefile.build:286: scripts/mod/empty.o] Error 1
falco-driver-loader make[1]: *** [Makefile:1681: modules_prepare] Error 2
falco-driver-loader make: *** [Makefile:38: all] Error 2
falco-driver-loader mv: cannot stat '/usr/src/falco-4.0.0+driver/bpf/probe.o': No such file or directory
falco-driver-loader Unable to load the falco eBPF probe

stmcginnis commented 1 year ago

@markusboehme - I seem to remember you being involved in investigating something related to this, but I can't find it now. Do you know if this is still an issue? I think I saw someone in the #falco channel in the k8s Slack say they were able to build on aarch64 now.

markusboehme commented 1 year ago

@stmcginnis Sorry, I had missed this until now! The last eBPF-related thing I debugged was probably #2504, which is unrelated. The gcc errors in particular make me believe the wrong compiler is being used to compile the eBPF probe, i.e. not the one from the SDK (or rather not the one from the toolchain produced by Buildroot; the SDK has another gcc as part of its base image).

The reproduction steps in this issue don't provide enough context for me to reproduce yet, so I'll have to read up on this. Assigning to me for now.

markusboehme commented 1 year ago

Here are the repro steps I ended up using.

  1. helm repo add falcosecurity https://falcosecurity.github.io/charts && helm repo update
  2. helm install falco falcosecurity/falco --namespace falco --create-namespace --set collectors.crio.enabled=false --set collectors.docker.enabled=false --set collectors.containerd.socket=/run/dockershim.sock --set podSecurityContext.seLinuxOptions.user=system_u --set podSecurityContext.seLinuxOptions.role=system_r --set podSecurityContext.seLinuxOptions.type=control_t --set podSecurityContext.seLinuxOptions.level=s0-s0:c0.c1023 --set driver.kind=ebpf --set driver.loader.initContainer.args={--compile}
  3. kubectl logs -n falco -c falco-driver-loader -f -l app.kubernetes.io/name=falco

Step 2 is lifted from https://github.com/bottlerocket-os/bottlerocket/issues/2275#issuecomment-1186315503 (thanks Ben!), but forcing compilation of the eBPF probe. If not forced to compile the probe, falco-driver-loader will download and use a pre-built version of it if there's a matching one available.

Using these steps I reproduced the error from the overview and tested various Bottlerocket releases, on both aarch64 and x86_64 and with both the 5.10 and 5.15 kernel series. On aarch64, compilation of the probe fails with the same error for both the 5.10 and 5.15 kernels, across Bottlerocket releases going as far back as 1.10.0 (I didn't bother going back more than half a year). On x86_64, compilation succeeds for both kernel series, though with warnings about a mismatched compiler version (a Debian build of gcc 5 vs. Bottlerocket's toolchain).
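To tell the outcomes apart while working through such a test matrix, the driver loader log can be classified by the messages quoted in the overview. The following is only a minimal sketch: `probe_build_status` and its result labels are hypothetical names, and the matched strings are taken from the error log in this issue, so other Falco versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: read a falco-driver-loader log on stdin and print
# a coarse status based on message fragments seen in this issue's log.
probe_build_status() {
    log=$(cat)
    case $log in
        # gcc refused an option requested by the kernel build system
        *"unrecognized command line option"*) echo "compiler-flag-rejected" ;;
        # the loader gave up on the eBPF probe for some other reason
        *"Unable to load the falco eBPF probe"*) echo "load-failed" ;;
        # a local build was started and no failure message was seen
        *"Trying to compile the eBPF probe"*) echo "compile-attempted" ;;
        *) echo "no-compile-seen" ;;
    esac
}
```

It can be fed from step 3, for example `kubectl logs -n falco -c falco-driver-loader -l app.kubernetes.io/name=falco | probe_build_status`.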

As I'm not familiar with Falco, I'm now taking a look at what's being compiled there and how.

markusboehme commented 1 year ago

What confused me at first was why the compiler options required by Bottlerocket's kernel config affect building the eBPF probe, since the probe is not a kernel module. There are two parts to it. First, the probe's Makefile hooks into the kernel build system by pretending to be a module. Second, Bottlerocket carries a downstream patch for building out-of-tree modules that requires some tools to be rebuilt on the host, for the target architecture, whenever a kernel module is compiled.

The falco-driver-loader container is based on Debian 10 ("Buster") and ships gcc 5.x, 6.x, and 8.x. Bottlerocket's kernel config requires support for patchable function entry prologues (-fpatchable-function-entry) and Arm pointer authentication (-mbranch-protection). The earliest gcc release providing both is gcc 9.x, so the current default falco-driver-loader image is unsuitable for compiling either form of the Falco driver, i.e. the probe or the module.
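Whether a given compiler accepts these options can be checked directly, without going through the kernel build. The sketch below is a hypothetical helper, not something Falco or Bottlerocket ships: `cc_supports_flag` compiles an empty C file with the option under test. The two option strings are copied from the error log in this issue. Note that -mbranch-protection is an AArch64 option, so the check is only meaningful when run with an aarch64 compiler, e.g. inside the driver loader container on an aarch64 node.

```shell
#!/bin/sh
# Hypothetical helper: succeed if compiler $1 accepts command line option $2.
cc_supports_flag() {
    cc=$1
    flag=$2
    out=$(mktemp) || return 2
    # Compile an empty C translation unit with the option under test.
    "$cc" "$flag" -x c -c /dev/null -o "$out" >/dev/null 2>&1
    status=$?
    rm -f "$out"
    return "$status"
}

# The two options gcc rejected in the error log from this issue.
# CC defaults to gcc; set it to probe another compiler binary.
for flag in '-mbranch-protection=pac-ret+leaf' '-fpatchable-function-entry=2'; do
    if cc_supports_flag "${CC:-gcc}" "$flag"; then
        echo "supported:   $flag"
    else
        echo "unsupported: $flag"
    fi
done
```

Probing the option itself rather than comparing version numbers avoids guessing at distribution backports.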

The Falco project does provide pre-built probes and modules for several distributions including Bottlerocket. The falco-driver-loader attempts to use those first, and only goes on to compile its own versions if it can't find any or if explicitly requested, as I did above. The automation that produces the pre-built drivers (driverkit) uses a more modern builder image with a newer toolchain for building the drivers for Bottlerocket.

As an aside: The drivers have to be rebuilt on kernel updates, or in Bottlerocket's case for new releases. Since this is only done periodically, there can naturally be a window between a Bottlerocket release and the corresponding Falco driver update in which Falco on Bottlerocket will not find a matching driver and will attempt to build its own.

Ideally, the Falco project could update the base image of falco-driver-loader as well to include a more recent version of gcc (e.g. Debian 11 ("Bullseye") which includes gcc 10.x). I'll check for or open a corresponding issue in their tracker tomorrow.

markusboehme commented 1 year ago

> Ideally, the Falco project could update the base image of falco-driver-loader as well to include a more recent version of gcc (e.g. Debian 11 ("Bullseye") which includes gcc 10.x). I'll check for or open a corresponding issue in their tracker tomorrow.

Turns out the Falco community has started discussing doing just that quite recently: https://github.com/falcosecurity/falco/issues/2489

Another alternative to look into could be the "modern BPF" driver that uses the BPF Type Format (BTF) and doesn't require recompilation, thereby sidestepping the issue. It can be chosen by passing --set driver.kind=modern-bpf to Helm, but is currently marked as experimental (so please do research this option before deciding to make the switch).
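For anyone who wants to try this, the install command from the repro steps can be parameterized by driver kind. This is an untested sketch: `install_falco` is a hypothetical wrapper, the Bottlerocket-specific settings are carried over unchanged from step 2 on the assumption that they still apply, and the `--compile` argument is dropped because nothing is compiled. The value `modern-bpf` is the one named above; other chart versions may spell it differently, so check the chart's values first.

```shell
#!/bin/sh
# Hypothetical wrapper: install Falco with the given driver kind ($1),
# reusing the Bottlerocket-specific settings from the repro steps.
install_falco() {
    driver_kind=$1
    helm install falco falcosecurity/falco \
        --namespace falco --create-namespace \
        --set collectors.crio.enabled=false \
        --set collectors.docker.enabled=false \
        --set collectors.containerd.socket=/run/dockershim.sock \
        --set podSecurityContext.seLinuxOptions.user=system_u \
        --set podSecurityContext.seLinuxOptions.role=system_r \
        --set podSecurityContext.seLinuxOptions.type=control_t \
        --set podSecurityContext.seLinuxOptions.level=s0-s0:c0.c1023 \
        --set "driver.kind=${driver_kind}"
}
```

With the chart repository added as in step 1, `install_falco modern-bpf` would select the experimental driver.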

markusboehme commented 1 year ago

Bottom line: There exists a time window after Bottlerocket releases in which no pre-compiled Falco drivers exist yet, and nodes attempt to compile those themselves. This fails for Bottlerocket due to the combination of a necessary downstream patch to Bottlerocket's kernel build system and an old toolchain in Falco's driver loader image. The self-compilation failure could be fixed by updating the base image of the Falco driver loader (https://github.com/falcosecurity/falco/issues/2489) and eventually becomes a non-issue entirely when Falco's new "modern BPF" driver leaves its experimental stage.