grafana / beyla

eBPF-based autoinstrumentation of web applications and network metrics
https://grafana.com/oss/beyla-ebpf/
Apache License 2.0

Update documentation for running beyla without privileges #785

Closed dashpole closed 2 weeks ago

dashpole commented 3 weeks ago

Follow-up to https://github.com/grafana/beyla/pull/741

This updates the documentation for running unprivileged to not require SYS_ADMIN.

This also updates the error message encountered when bpf_probe_write_user fails to indicate that the SYS_ADMIN capability is required.
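
For anyone who wants to check whether a process actually holds SYS_ADMIN before hitting that error, here is a minimal Python sketch that decodes the effective capability mask from the `CapEff` line of `/proc/<pid>/status`. The bit numbers are the ones defined in `linux/capability.h`; the function names are made up for illustration and are not part of Beyla.

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_SYS_ADMIN = 21
CAP_PERFMON = 38
CAP_BPF = 39


def parse_cap_eff(status_text: str) -> int:
    """Return the effective capability mask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The mask is printed as a hexadecimal string without a 0x prefix.
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask: int, cap: int) -> bool:
    """Check whether the given capability bit is set in the mask."""
    return bool((mask >> cap) & 1)


def current_process_mask() -> int:
    """Hypothetical helper: read the mask of the calling process (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_cap_eff(f.read())
```

Running `has_cap(current_process_mask(), CAP_SYS_ADMIN)` inside the container shows whether the capability made it through the security context.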

I still need to test with the setup described in the documentation:

> The following guide is based on tests performed mainly by running containerd with kubeadm, k3s, microk8s and kind.

grcevski commented 3 weeks ago

The integration test failure is intermittent and something I caused 2 days ago. I just merged a PR this morning, so I hope that merging with main will make it go away.

dashpole commented 3 weeks ago

I've tested this on GKE. I'm not sure I'll have time to test this on the setup described in the docs for a bit.

grcevski commented 3 weeks ago

> I've tested this on GKE. I'm not sure I'll have time to test this on the setup described in the docs for a bit.

That's fine, we'll test on the others. If it's not exactly the same, we can amend the docs again to spell out the differences.

grcevski commented 2 weeks ago

I've confirmed that this works well on the others. I'm going to merge this and open a new PR with a note for folks deploying this themselves on premises. Namely, there could be an issue with using PERFMON (instead of SYS_ADMIN) when the Linux kernel perf event permissions are set to a high paranoid level, i.e. the value inside /proc/sys/kernel/perf_event_paranoid. Any value over 2 will require CAP_SYS_ADMIN, and kprobes rely on perf events.

The relevant docs for the kernel say this (https://www.kernel.org/doc/Documentation/sysctl/kernel.txt):

perf_event_paranoid:

Controls use of the performance events system by unprivileged
users (without CAP_SYS_ADMIN).  The default value is 2.

 -1: Allow use of (almost) all events by all users
     Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>=0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
     Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>=1: Disallow CPU event access by users without CAP_SYS_ADMIN
>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN

However, on a Debian-based distro the value can be higher than 2 (according to this article: https://lwn.net/Articles/696216/). It says that a value of 3 (i.e. kernel.perf_event_paranoid=3) restricts perf_event_open() to processes with the CAP_SYS_ADMIN capability.
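
To make the threshold above concrete, here is a minimal Python sketch that reads the sysctl and applies the rule as described in this thread (values over 2 need CAP_SYS_ADMIN). The helper names are hypothetical, and this is not how Beyla itself checks anything.

```python
from pathlib import Path

# Location of the sysctl discussed above.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid(path: Path = PARANOID_PATH) -> int:
    """Read the current perf_event_paranoid level (it can be -1)."""
    return int(path.read_text().strip())


def perfmon_is_enough(level: int) -> bool:
    """Assumption taken from this thread: any level over 2 needs CAP_SYS_ADMIN."""
    return level <= 2


def describe(level: int) -> str:
    """Turn the level into a short human-readable hint."""
    if perfmon_is_enough(level):
        return f"perf_event_paranoid={level}: PERFMON should be sufficient"
    return f"perf_event_paranoid={level}: SYS_ADMIN is likely required"
```

On a node, `print(describe(read_paranoid()))` gives a quick hint before deploying with the reduced capability set.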

And I've confirmed that Beyla fails like this:

time=2024-04-26T17:12:22.342Z level=ERROR msg="Unable to load eBPF watcher for process events" component=discover.ProcessWatcher interval=5s error="instrumenting function \"sys_bind\": setting kprobe: creating perf_kprobe PMU (arch-specific fallback for \"sys_bind\"): token
grcevski commented 2 weeks ago

Thanks so much @dashpole for your contribution!