buildkite / buildkite-agent-metrics

A command-line tool (and Lambda) for collecting Buildkite agent metrics
MIT License

Fix for Prometheus backend panicking when using unclustered agents #288

Closed wolfeidau closed 2 months ago

wolfeidau commented 2 months ago

If a customer is running unclustered agents with the Prometheus backend, the CLI panics when building the metrics:

panic: inconsistent label cardinality: expected 1 label values but got 0 in []string(nil)

goroutine 1 [running]:
github.com/prometheus/client_golang/prometheus.(*GaugeVec).WithLabelValues(...)
        github.com/prometheus/client_golang@v1.19.0/prometheus/gauge.go:238
github.com/buildkite/buildkite-agent-metrics/v5/backend.(*Prometheus).Collect(0x14000144618, 0x140008187c0)
        github.com/buildkite/buildkite-agent-metrics/v5/backend/prometheus.go:65 +0x8b0
main.main.func1()
        github.com/buildkite/buildkite-agent-metrics/v5/main.go:182 +0xc4
main.main()
        github.com/buildkite/buildkite-agent-metrics/v5/main.go:194 +0x1160

This change leaves the pruning of empty cluster names to Prometheus. For example, here is buildkite_total_idle_agent_count with both unclustered and clustered agents:

buildkite_total_idle_agent_count{instance="whatever:9999", job="prometheus"} 
buildkite_total_idle_agent_count{cluster="Default cluster", instance="whatever:9999", job="prometheus"}
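The failure mode and the fix can be illustrated with a minimal, stdlib-only sketch. It does not use client_golang and is not the code from this PR: `gaugeVec`, `prunedLabelValues`, and `unprunedLabelValues` are hypothetical names, and the "pruned" helper is only an assumption about how a nil label slice could have reached `WithLabelValues`. The sketch models one rule, that the number of label values must match the number of label names, and shows that passing an empty cluster name satisfies it.

```go
package main

import (
	"fmt"
	"strings"
)

// gaugeVec is a hypothetical stand-in for a labelled gauge vector.
// It only models the label cardinality rule, nothing else.
type gaugeVec struct {
	labelNames []string
	series     map[string]float64
}

func newGaugeVec(labelNames ...string) *gaugeVec {
	return &gaugeVec{labelNames: labelNames, series: make(map[string]float64)}
}

// set stores a value for one label combination. It returns an error where
// the real library panics: when the value count differs from the name count.
func (g *gaugeVec) set(value float64, labelValues ...string) error {
	if len(labelValues) != len(g.labelNames) {
		return fmt.Errorf("inconsistent label cardinality: expected %d label values but got %d",
			len(g.labelNames), len(labelValues))
	}
	g.series[strings.Join(labelValues, "\xff")] = value
	return nil
}

// prunedLabelValues drops an empty cluster name (assumed old behaviour),
// so unclustered agents yield zero label values.
func prunedLabelValues(cluster string) []string {
	if cluster == "" {
		return nil
	}
	return []string{cluster}
}

// unprunedLabelValues always returns exactly one value, even when the
// cluster name is empty, so the cardinality always matches.
func unprunedLabelValues(cluster string) []string {
	return []string{cluster}
}

func main() {
	g := newGaugeVec("cluster")
	fmt.Println(g.set(3, prunedLabelValues("")...))                  // cardinality error
	fmt.Println(g.set(3, unprunedLabelValues("")...))                // accepted
	fmt.Println(g.set(5, unprunedLabelValues("Default cluster")...)) // accepted
}
```

In Prometheus, a label with an empty value is treated the same as a missing label, which is why the unclustered series above appears without a `cluster` label.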