giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273
Apache License 2.0

Check the resource usage of Alloy #3724

Open Rotfuks opened 4 days ago

Rotfuks commented 4 days ago

Motivation

We've seen some unexpected numbers when checking the golem installation, where Alloy is already rolled out. Alloy seems to use significantly more resources than promtail and prometheus-agent combined, which is not good.

Todo

Outcome

hervenicol commented 1 day ago

Actual RAM usage for Alloy on golem

Queries

RAM: sum(container_memory_working_set_bytes{cluster_id="golem", namespace="kube-system", pod=~"alloy-metrics.*", container!="", image!=""}) by (pod)

Series: sum(prometheus_remote_write_wal_storage_active_series{pod=~"alloy-metrics-.*"}) by (pod)
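For anyone who wants to re-run these two queries outside Grafana, here is a minimal Python sketch that sends them to the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`) and returns one value per pod. The base URL is an assumption (some backends put the API under a prefix), and the helper names `build_query_url`, `parse_per_pod` and `query_per_pod` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# PromQL copied verbatim from the two queries above.
RAM_QUERY = (
    'sum(container_memory_working_set_bytes{cluster_id="golem", '
    'namespace="kube-system", pod=~"alloy-metrics.*", container!="", image!=""}) by (pod)'
)
SERIES_QUERY = (
    'sum(prometheus_remote_write_wal_storage_active_series'
    '{pod=~"alloy-metrics-.*"}) by (pod)'
)


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_per_pod(payload: dict) -> dict:
    """Turn an instant-vector API response into {pod: value}."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample looks like {"metric": {...labels...}, "value": [timestamp, "value"]}.
    return {
        sample["metric"].get("pod", ""): float(sample["value"][1])
        for sample in data["result"]
    }


def query_per_pod(base_url: str, promql: str, timeout: float = 30.0) -> dict:
    """Run one instant query and return {pod: value}."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_per_pod(json.load(resp))
```

Usage would be something like `query_per_pod("http://localhost:9090", RAM_QUERY)`, where `http://localhost:9090` is a placeholder for wherever the metrics backend is reachable.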

Numbers

Currently:

=> The pod scraping 400k metrics is the one using less RAM, but it is also the youngest one, so its usage may not have settled yet.
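One way to make pods with different loads comparable is to divide the RAM result by the series result per pod. A small sketch, assuming both query results are already available as `{pod: value}` dicts; `bytes_per_series` is a hypothetical helper, and this does not correct for pod age.

```python
def bytes_per_series(ram_bytes: dict, active_series: dict) -> dict:
    """Return {pod: working-set bytes per active series}.

    Pods that are missing from active_series, or that report zero
    series, are skipped to avoid a meaningless or undefined ratio.
    """
    ratios = {}
    for pod, ram in ram_bytes.items():
        series = active_series.get(pod)
        if not series:
            continue
        ratios[pod] = ram / series
    return ratios
```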

Pprof

Extracting data:

Visualizing data with https://play.grafana.org/a/grafana-pyroscope-app/ad-hoc
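A heap profile for this kind of analysis can be pulled from the Go pprof endpoints that Alloy serves under `/debug/pprof/` on its HTTP server. This is a minimal sketch and not necessarily the commands used here: it assumes the pod's HTTP port has been made reachable locally (for example with a `kubectl port-forward` to port 12345, Alloy's default), and `build_pprof_url` and `fetch_profile` are made-up helper names.

```python
import shutil
import urllib.parse
import urllib.request


def build_pprof_url(base_url: str, profile: str = "heap", seconds=None) -> str:
    """Build the URL of a Go pprof endpoint, e.g. /debug/pprof/heap."""
    url = f"{base_url.rstrip('/')}/debug/pprof/{profile}"
    if seconds is not None:
        # Timed profiles such as "profile" (CPU) accept a duration in seconds.
        url += "?" + urllib.parse.urlencode({"seconds": seconds})
    return url


def fetch_profile(base_url: str, dest: str, profile: str = "heap",
                  seconds=None, timeout: float = 120.0) -> str:
    """Download one profile and write it to dest; returns dest."""
    url = build_pprof_url(base_url, profile, seconds)
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

For example, `fetch_profile("http://localhost:12345", "alloy-heap.pprof")` would save a heap profile that should then be uploadable to the ad-hoc view linked above. The URL and file name are placeholders.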