Open thomassandslyst opened 6 months ago
Any nudge on this? Is there anything you'd like me to do to get this sorted?
I would love to see this change; it would make tracking workflow execution times in Grafana much easier 🙏
Plus one, please review and accept this PR. Emitting high-cardinality metrics like this is explicitly discouraged by the prometheus/client_golang maintainers. It also effectively leads to unbounded memory growth unless pods are restarted.
https://github.com/prometheus/client_golang/issues/748 https://github.com/prometheus/client_golang/discussions/920
This is to solve https://github.com/actions/actions-runner-controller/issues/3153
This removes the runner_id, runner_name, and job_workflow_ref labels from the job_startup_duration_seconds and job_execution_duration_seconds metrics. Doing so reduces cardinality and allows histograms to be produced from them, with the idea that startup and execution data will be stored in "per repo + workflow" buckets.
I'm unsure whether removing `labelKeyJobWorkflowRef` from `jobLabels` is suitable or if this should be reworked more to come up with more suitable lists.