sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports the results as Prometheus metrics.
https://sustainable-computing.io
Apache License 2.0

How to reduce the sampling rate? #539

Closed andersonandrei closed 1 year ago

andersonandrei commented 1 year ago

Hello everyone,

I have been working with serverless platforms on top of Kubernetes, more specifically OpenWhisk. There I execute small functions with short execution times: at most 5 minutes, but typically around 1 minute. I want to extract their energy metrics with Kepler, but the default sampling rate seems to be 15s. I would like to reduce it to 5s or even 1s. Can you help me, please?

Thank you very much!

rootfs commented 1 year ago

@andersonandrei

Kepler metrics are accrued over time, so no data is lost. There is, however, reporting (aka sampling) latency, so if you want to see the metrics in a more timely manner, you can try lower latency settings.

There are two places that set the sampling interval: the ServiceMonitor and kepler.

The Prometheus scrape interval is set to 3s in the default ServiceMonitor manifest.
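
For reference, here is a minimal ServiceMonitor sketch with the interval set to 3s. The name, namespace, labels, and port name below are assumptions based on a typical Kepler deployment, so adjust them to match your manifests; the key field is `interval` on the endpoint:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kepler-exporter        # assumed name; match your deployment
  namespace: kepler            # assumed namespace
  labels:
    app.kubernetes.io/name: kepler-exporter
spec:
  endpoints:
    - port: http               # assumed metrics port name from the Kepler Service
      interval: 3s             # how often Prometheus scrapes Kepler
  selector:
    matchLabels:
      app.kubernetes.io/name: kepler-exporter
  namespaceSelector:
    matchNames:
      - kepler
```

Note that the interval applies per endpoint; if it is omitted, Prometheus falls back to its global scrape interval.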

The Kepler sampling interval is hardcoded at 3s (you are welcome to make a PR to make it configurable).

There might also be some latency from Prometheus or the Grafana dashboard. If you can help us figure that out, that'll be great.

andersonandrei commented 1 year ago

Hi @rootfs , thank you very much for your answer.

I will work with a 3s sampling interval for a while, and then I will try to modify the code to use 1s.

Also, I have not been able to get 3s yet, even after setting the 3s interval in the ServiceMonitor as you described. The sampling interval that I observe through Prometheus after deploying Kepler is 14s. Can you show me an example of how to set up Prometheus for that, please?
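
To be concrete, something along these lines is what I am after (a rough sketch of a plain Prometheus configuration; the job name, target address, and port are just placeholders, not taken from my actual setup):

```yaml
global:
  scrape_interval: 3s          # global default, used by jobs that set no interval of their own
scrape_configs:
  - job_name: kepler
    scrape_interval: 3s        # per-job override
    static_configs:
      - targets: ["kepler-exporter.kepler.svc.cluster.local:9102"]  # placeholder target
```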

Thanks again!

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

marceloamaral commented 1 year ago

@andersonandrei were you able to fix your problem? Can we close this issue?

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.