sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports the estimates as Prometheus metrics.
https://sustainable-computing.io
Apache License 2.0
1.06k stars 169 forks

[rfe] default bpf sample rate #1553

Open jtaleric opened 2 weeks ago

jtaleric commented 2 weeks ago

What would you like to be added?

We should consider lowering the default sampling rate by setting EXPERIMENTAL_BPF_SAMPLE_RATE: 1000

We have seen a decent reduction in CPU utilization when reducing the sampling rate.

Risk: We have not quantified the possible loss of granularity in the power data when enabling this. However, compared with our raw Redfish data, the results are still very close.

Why is this needed?

CPU usage reduction.

dave-tucker commented 2 weeks ago

We'll need to retest on main, since #1481 has changed what's done before the sampling check in the eBPF code compared with the last released version. There might not be an appreciable difference in probe execution time with or without sampling set. I'll set up some micro-benchmarks to confirm once #1438 has gone in, because that is blocking easier benchmarking and testing of the eBPF code.