hashicorp / vault-secrets-operator

The Vault Secrets Operator (VSO) allows Pods to consume Vault secrets natively from Kubernetes Secrets.

VSO controller_resource_status metrics not updating #932

Open luizrojo opened 1 month ago

luizrojo commented 1 month ago

Describe the bug

The controller_resource_status metrics, which report the auth and connection status between the VSO and a Vault instance, are not being updated.

Metrics for controller="vaultconnection" and controller="vaultauth" experience this issue.

From the observed behavior, the metric is only updated when going from a failure state to a working state. The transition from working to failing is never reflected.

To Reproduce

Steps to reproduce the behavior:

  1. Deploy the Vault Secrets Operator to the Kubernetes cluster with the required connectivity in place (for example, network policies that allow traffic to Vault);
  2. Wait for the VSO to start, run its metric-related checks, and begin exposing the metrics;
  3. Validate that the controller_resource_status metrics show a value of 1, meaning that the VSO was able to connect to Vault;
  4. Drop the network policies (or otherwise cut network connectivity) to Vault;
  5. Watch the metrics and see that they are not updated to 0, the value that would indicate the connection is failing.
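Steps 3 and 5 can also be checked without a dashboard by reading the scraped metrics text directly. The following is a minimal sketch, not part of the VSO: it assumes the metrics are exposed in the Prometheus text format, and the sample payload, the `name` label, and the `resource_status` helper are hypothetical. The metric name is a parameter because the exposed name may carry a prefix.

```python
import re

# Hypothetical sample of Prometheus text exposition output. Only the metric
# name and the `controller` label values come from this issue; the `name`
# label and the values are assumptions for illustration.
SAMPLE = """\
# TYPE controller_resource_status gauge
controller_resource_status{controller="vaultconnection",name="default"} 1
controller_resource_status{controller="vaultauth",name="default"} 1
some_other_metric{controller="vaultauth"} 7
"""

# metric_name{labels} value
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def resource_status(text, metric="controller_resource_status"):
    """Return {controller label: value} for every sample of `metric`.

    If several series share the same controller label, the last one wins.
    """
    out = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE_RE.match(line)
        if m is None or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        out[labels.get("controller", "")] = float(m.group("value"))
    return out


statuses = resource_status(SAMPLE)
print(statuses)
```

Running this against the text captured before and after step 4 makes the problem visible: both captures report 1 for `vaultconnection` and `vaultauth`.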

Here are screenshots of a dashboard that show the behavior:

Screenshot 2024-09-24 at 11 17 47

Screenshot 2024-09-24 at 11 27 01

Application deployment:

There is no application involved in this case, since this is a VSO issue.

Expected behavior

When connectivity to Vault is lost, the metrics should be updated to reflect the actual status, that is, drop to 0.
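The expectation can be stated as a simple check over a timeline of observations. This is only a sketch of the expected behavior: it assumes some independent probe reports whether Vault is actually reachable, and the timeline data and the `stale_samples` helper are hypothetical.

```python
def stale_samples(samples):
    """Return the indices where the metric disagrees with real reachability.

    Each sample is a (vault_reachable, metric_value) pair. The metric is
    expected to be 1 when Vault is reachable and 0 when it is not.
    """
    stale = []
    for i, (reachable, value) in enumerate(samples):
        expected = 1.0 if reachable else 0.0
        if value != expected:
            stale.append(i)
    return stale


# Hypothetical timeline mirroring the reproduction steps: connectivity is
# dropped at index 2 and restored at index 4, but the metric stays at 1.
timeline = [
    (True, 1.0),
    (True, 1.0),
    (False, 1.0),
    (False, 1.0),
    (True, 1.0),
]

stale = stale_samples(timeline)
print(stale)  # [2, 3]
```

With the behavior described in this issue, every sample taken during the outage is flagged. With the expected behavior, the list would be empty.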

Environment

Additional context

We are moving to the latest version available at the time of writing (0.8.1), but the changelog contains no entries on metrics or observability that would indicate this has been fixed or improved.