Open Rotfuks opened 4 days ago
golem
RAM: sum(container_memory_working_set_bytes{cluster_id="golem", namespace="kube-system", pod=~"alloy-metrics.*", container!="", image!=""}) by (pod)
Series: sum(prometheus_remote_write_wal_storage_active_series{pod=~"alloy-metrics-.*"}) by (pod)
Currently:
=> The pod scraping 400k metrics is the one using less RAM, but it is also the younger one, so the two pods are not directly comparable yet.
Extracting data:
ks port-forward alloy-metrics-0 12345
curl localhost:12345/debug/pprof/heap -o heap.pprof
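To compare both pods, the two commands above can be wrapped in a small loop. This is only a sketch under assumptions: it assumes `ks` is an alias for `kubectl -n kube-system` (the namespace is taken from the RAM query), and the `heap-<pod>.pprof` file naming, the `profile_path` and `capture_heap` helpers, and the `run` argument are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: capture one heap profile per alloy-metrics pod.
# Assumption: `ks` in the issue stands for `kubectl -n kube-system`.
set -eu

NAMESPACE="${NAMESPACE:-kube-system}"  # assumed from the RAM query's namespace label
PORT="${PORT:-12345}"                  # port used in the commands above

# Hypothetical naming scheme: one output file per pod.
profile_path() {
  printf 'heap-%s.pprof\n' "$1"
}

# Open a temporary port-forward, download the heap profile, close the tunnel.
capture_heap() {
  local pod="$1"
  kubectl -n "$NAMESPACE" port-forward "$pod" "$PORT" >/dev/null &
  local pf_pid=$!
  sleep 2  # crude wait for the tunnel to come up
  local status=0
  curl -fsS "localhost:${PORT}/debug/pprof/heap" -o "$(profile_path "$pod")" || status=$?
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Only touch the cluster when called with the (hypothetical) `run` argument.
if [ "${1:-}" = "run" ]; then
  for pod in alloy-metrics-0 alloy-metrics-1; do
    capture_heap "$pod"
  done
fi
```

Saved as, say, `capture-heap.sh` and started with `./capture-heap.sh run`, this should leave `heap-alloy-metrics-0.pprof` and `heap-alloy-metrics-1.pprof` in the current directory, ready to upload one at a time.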
Visualizing data with https://play.grafana.org/a/grafana-pyroscope-app/ad-hoc
alloy-metrics-0:
alloy-metrics-1:
Motivation
We've seen some unexpected resource numbers when checking the golem installation, on which Alloy is already rolled out. It seems that Alloy is using significantly more resources than Promtail and prometheus-agent combined. This is not good, so we need to find out where the memory goes.
Todo
Outcome