grafana / beyla

eBPF-based autoinstrumentation of web applications and network metrics
https://grafana.com/oss/beyla-ebpf/
Apache License 2.0

Limit impact on k8s apiserver in large clusters #824

Open · dashpole opened this issue 6 months ago

dashpole commented 6 months ago

What I would like to be able to do

I mentioned this briefly at the community meeting earlier today.

As a general best practice, DaemonSets should avoid watching resources cluster-wide, such as watching all pods, all replicasets, all services, etc. Because every node runs a copy of the DaemonSet, cluster-wide watches make kube-apiserver load grow with the number of nodes, which can limit the maximum possible number of nodes in a cluster. It is acceptable to watch the pods assigned to the same node as the DaemonSet pod. That actually generates less load on the kube-apiserver than a deployment with multiple replicas watching all pods, since traffic for the deployment is roughly O(pods * replicas). Ideally, I would like to be able to run Beyla with the following architecture:

To do that, it would be nice to have more control over which k8s resources Beyla watches. This would typically be done with field selectors, similar to the selectors config in the Prometheus server's kubernetes_sd_configs.
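For illustration, here is a minimal client-go sketch of what node-scoped watching could look like. This is not Beyla's current code; the NODE_NAME environment variable and the overall structure are assumptions. The field selector means the apiserver only sends pods scheduled on the local node.

```go
// Minimal sketch (not Beyla's actual implementation) of watching only the
// pods scheduled on the local node, assuming the node name is injected via
// the NODE_NAME environment variable (e.g. through the Downward API).
package main

import (
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Restrict the watch to this node: the apiserver only sends pods whose
	// spec.nodeName matches, so the DaemonSet's load stays proportional to
	// the pods on its own node rather than to the whole cluster.
	node := os.Getenv("NODE_NAME")
	factory := informers.NewSharedInformerFactoryWithOptions(
		client, 30*time.Minute,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "spec.nodeName=" + node
		}),
	)

	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			_ = pod // decorate telemetry with this pod's metadata here
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	<-stop
}
```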

Alternatives considered

The above will work well for single-application metrics, like HTTP golden-signal metrics for a pod, since all of the relevant metadata is about pods running on the same node. However, the approach won't work if I want to build a service graph, because I would also need metadata for pods running on other nodes, which defeats the purpose of the improvement. I had considered doing all of the IP -> Pod mapping in a separate deployment to enable that use case.
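As a rough sketch of that idea (hypothetical, not something Beyla implements; the package, type, and function names are made up), a single deployment could hold the one cluster-wide pod watch and index pods by status.podIP, so node agents could resolve peer IPs without each holding their own cluster-wide watch:

```go
// Hypothetical central IP -> Pod index, kept by a single Deployment that
// owns the cluster-wide pod watch.
package ipindex

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// PodIndex maps pod IPs to pod metadata.
type PodIndex struct {
	mu   sync.RWMutex
	byIP map[string]*corev1.Pod
}

// NewPodIndex starts a cluster-wide pod informer and keeps the IP index
// up to date until stop is closed.
func NewPodIndex(client kubernetes.Interface, stop <-chan struct{}) *PodIndex {
	idx := &PodIndex{byIP: map[string]*corev1.Pod{}}
	factory := informers.NewSharedInformerFactory(client, 0)
	inf := factory.Core().V1().Pods().Informer()
	inf.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { idx.set(obj.(*corev1.Pod)) },
		UpdateFunc: func(_, obj interface{}) { idx.set(obj.(*corev1.Pod)) },
		DeleteFunc: func(obj interface{}) {
			if pod, ok := obj.(*corev1.Pod); ok {
				idx.remove(pod)
			}
		},
	})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	return idx
}

func (i *PodIndex) set(pod *corev1.Pod) {
	i.mu.Lock()
	defer i.mu.Unlock()
	if pod.Status.PodIP != "" {
		i.byIP[pod.Status.PodIP] = pod
	}
}

func (i *PodIndex) remove(pod *corev1.Pod) {
	i.mu.Lock()
	defer i.mu.Unlock()
	delete(i.byIP, pod.Status.PodIP)
}

// Lookup returns the pod that owns ip, if any.
func (i *PodIndex) Lookup(ip string) (*corev1.Pod, bool) {
	i.mu.RLock()
	defer i.mu.RUnlock()
	pod, ok := i.byIP[ip]
	return pod, ok
}
```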

The issue I ran into is filtering. At least on GKE, there is a bunch of traffic to things I don't really care about (e.g. kubelet health checks). I would like to be able to filter out things that aren't a pod, and only collect telemetry for pods, but I couldn't figure out how to do that (and couldn't think of a good way to implement it, either).
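The kind of filtering I have in mind would look roughly like this (again hypothetical; the Flow type and function names are made up): drop any flow whose endpoints don't resolve to pod IPs via an index like the one sketched above, which would exclude things such as kubelet probes arriving from the node IP.

```go
package filter

// Flow is a stand-in for a captured network flow record.
type Flow struct {
	SrcIP, DstIP string
}

// KeepPodTraffic keeps only flows where both endpoints resolve to pod IPs.
// isPodIP would be backed by an IP -> Pod index such as the one above.
func KeepPodTraffic(flows []Flow, isPodIP func(ip string) bool) []Flow {
	kept := make([]Flow, 0, len(flows))
	for _, f := range flows {
		// Kubelet health checks and other node-level traffic originate from
		// the node IP rather than a pod IP, so they fail this check.
		if isPodIP(f.SrcIP) && isPodIP(f.DstIP) {
			kept = append(kept, f)
		}
	}
	return kept
}
```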

dimunech commented 5 months ago

To expand on this: the Kubernetes metadata decorator adds considerable load to the Kubernetes API servers. Here's a graph of master-node memory usage before and after disabling the decorator (the yellow annotation on the graph marks the change).

[Screenshot, 2024-05-31: master-node memory usage before/after disabling the decorator]

marctc commented 2 months ago

I believe this issue was fixed by #997. Please reopen this issue if that's not the case, thanks!

dashpole commented 2 months ago

Awesome, thank you!