elastic / ebpf

Elastic's eBPF

Fix endpoint build issues #116

Closed rhysre closed 2 years ago

rhysre commented 2 years ago

This PR fixes two endpoint build issues.

tty_write problem details

[pahole](https://github.com/acmel/dwarves) is the tool run during the kernel build process to generate BTF information. Before pahole 1.22, however, it did not generate BTF information for the signature of tty_write on ARM64 (x86 works fine). [This change](https://github.com/acmel/dwarves/commit/58a98f76ac95b1bb11920ff2b58206b2364e6b3b) inadvertently fixed things in 1.22. Basically, to save space, pahole used to limit the amount of BTF it generated by only emitting BTF for functions that matched certain criteria. One of those criteria was that a function had to be known to ftrace. This was a reasonable assumption, as functions known to ftrace are the most likely to be traced, and thus the most likely to require BTF type information (so BTF probes can nicely parse their arguments). The change I cited removed this constraint, as it was decided that BTF information can be generated for everything and that the utility outweighs the extra space.

Now, the list of functions known to ftrace is stored in an array in the kernel binary, delimited by the symbols `__start_mcount_loc` and `__stop_mcount_loc`. If you just map the binary into memory using mmap and seek to the start symbol, theoretically you'll have a list of addresses that correspond to functions known to ftrace, and indeed, on x86 this is the case. For an ARM64 build, however (with the default kernel config), that list is entirely nulls. The reason is that, instead of putting the addresses inline in the list, an ARM64 build leaves the list in the binary as nulls and generates a bunch of relocations, so that the kernel fills in the list when it patches up relocations at boot. Naturally, then, the list will be populated once the kernel is booted, but if you just map the kernel binary into memory (as pahole does at build time) and try to read the array to get the ftrace list, it'll just be nulls.
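To illustrate the point above, here is a minimal Python sketch of reading that array straight out of a file image, the way a build-time tool would. The file offsets, the synthetic images, and the helper names (`read_mcount_loc`, `is_unpopulated`) are hypothetical. A real tool would resolve the start and end symbols through the ELF symbol table and section headers, and this is not pahole's actual code.

```python
import struct

ENTRY_SIZE = 8  # assumption: each entry is one little-endian 64-bit address


def read_mcount_loc(image, start_off, stop_off):
    """Decode the 64-bit addresses stored in image[start_off:stop_off].

    start_off / stop_off stand in for the file offsets of the start and
    end symbols (hypothetical values in this sketch).
    """
    size = stop_off - start_off
    if size < 0 or size % ENTRY_SIZE != 0:
        raise ValueError("range is not a whole number of 8-byte entries")
    count = size // ENTRY_SIZE
    return list(struct.unpack_from(f"<{count}Q", image, start_off))


def is_unpopulated(addresses):
    """True when the list exists but every entry is still null."""
    return len(addresses) > 0 and all(addr == 0 for addr in addresses)


# Synthetic stand-ins for a kernel image: 8 bytes of padding, then 3 entries.
PADDING = bytes(8)
x86_like = PADDING + struct.pack(
    "<3Q", 0xFFFFFFFF81000010, 0xFFFFFFFF81000230, 0xFFFFFFFF81000400
)
arm64_like = PADDING + bytes(3 * ENTRY_SIZE)  # entries left as nulls on disk

x86_entries = read_mcount_loc(x86_like, 8, 32)
arm64_entries = read_mcount_loc(arm64_like, 8, 32)

print("x86-like populated:  ", not is_unpopulated(x86_entries))
print("arm64-like populated:", not is_unpopulated(arm64_entries))
```

A check like `is_unpopulated` is the kind of guard a build-time reader would need before trusting the list as a filter.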
This difference in behaviour is due to the fact that the ARM64 build uses the [recordmcount](https://github.com/torvalds/linux/blob/master/scripts/recordmcount.c) tool at build time to generate the array (which generates a relocation for each item), while on x86 this is done in GCC via the `-mrecord-mcount` flag (which doesn't generate relocations; it just puts the data straight into the list). These relocations can be inlined at build time by disabling `CONFIG_RELOCATABLE`, but it seems almost no ARM64 kernels do this. If you're interested in why ftrace needs that list, see https://www.brendangregg.com/blog/2019-10-15/kernelrecipes-kernel-ftrace-internals.html (excellent talk).
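As a rough model of the boot-time fix-up described above, the sketch below fills a null array from a list of relocations. It assumes the entries are covered by `R_AARCH64_RELATIVE`-style relocations (value written = load bias + addend) and, for simplicity, treats each relocation offset as a direct offset into the buffer. `apply_relative_relocs` and the sample addresses are made up for illustration, and this is not the kernel's implementation.

```python
import struct

R_AARCH64_RELATIVE = 1027  # relocation type number from the AArch64 ELF ABI
MASK64 = (1 << 64) - 1


def apply_relative_relocs(image, relocs, load_bias=0):
    """Patch `image` in place from (r_offset, r_type, r_addend) tuples.

    Simplification (assumption): r_offset is used directly as an index
    into `image`; a real loader works with virtual addresses.
    """
    for r_offset, r_type, r_addend in relocs:
        if r_type != R_AARCH64_RELATIVE:
            continue  # this sketch only models RELATIVE relocations
        value = (load_bias + r_addend) & MASK64
        struct.pack_into("<Q", image, r_offset, value)
    return image


# The on-disk array: three entries, all null.
mcount_list = bytearray(3 * 8)

# Hypothetical relocations, one per entry.
relocs = [
    (0, R_AARCH64_RELATIVE, 0xFFFF800010081000),
    (8, R_AARCH64_RELATIVE, 0xFFFF800010081440),
    (16, R_AARCH64_RELATIVE, 0xFFFF800010082A00),
]

before = list(struct.unpack("<3Q", mcount_list))  # what a build-time reader sees
apply_relative_relocs(mcount_list, relocs)
after = list(struct.unpack("<3Q", mcount_list))   # what the booted kernel sees

print("before:", [hex(v) for v in before])
print("after: ", [hex(v) for v in after])
```

The same data ends up in the array either way. The only difference is whether it is written at build time or at boot, which is why a tool that reads the file on disk sees nulls.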