pulumi / pulumi-kubernetes-operator

A Kubernetes Operator that automates the deployment of Pulumi Stacks
Apache License 2.0

Add metrics for currently reconciling stacks #576

Open nicu-da opened 5 months ago

nicu-da commented 5 months ago

Hello!

Issue details

Affected area/feature

Operator metrics

Add a metric, similar to stacks_failing, that tracks the stacks that are currently being reconciled. For reference, the existing metric is documented as:

stacks_failing - a set of gauge time series, labelled by namespace, that gives the number of stacks currently failing (stack.status.lastUpdate.state is failed)

Unlike controller_runtime_active_workers, the new metric should be labeled with the stack name. That would make it possible to visualize which stacks are currently being updated and to add alerts when a stack takes too long to update.

cleverguy25 commented 2 months ago

Added to epic https://github.com/pulumi/pulumi-kubernetes-operator/issues/586