Open iveahugeship opened 1 year ago
For clarity, I'd like to state up front that I'm not a HashiCorp employee, just a contributor who happens to have worked with Vault and Prometheus before.
This is a new presentation of the same underlying issue talked about in #11988.
Vault's default behaviour when using Prometheus metrics is really quite unhelpful, with many undocumented/underdocumented pitfalls.
HashiCorp people: Is there any way we could reignite the stalled conversations in hashicorp/go-metrics#136, hashicorp/consul#13495, hashicorp/consul#13498, and #11988 (yes, some of those are Consul issues, but it applies to both products)?
@maxb thanks for your reply.
I've found that these metrics don't work properly:
- `vault.expire.[renew,revoke]`: always returns NaN
- `vault.token.[renew,revoke]`: always returns NaN
- `vault.ha.rpc.client.*`: does not exist
- `vault_raft_state_leader` and `vault_raft_state_candidate`: always return 1
Describe the bug
Hello!
According to the documentation, `vault.core.active` returns 1 when the Vault node is active and 0 when the node is in standby. The problem is that it returns unexpected values: it returns 1 if I query this metric with the `cluster` label and 0 without it.

Element | Value
-- | --
vault_core_active{cluster="vault-cluster-85407450",instance="X.X.X.X:8200",job="vault"} | 1
vault_core_active{cluster="vault-cluster-85407450",instance="X.X.X.X:8200",job="vault"} | 1
vault_core_active{cluster="vault-cluster-85407450",instance="X.X.X.X:8200",job="vault"} | 1
vault_core_active{instance="X.X.X.X:8200",job="vault"} | 0
vault_core_active{instance="X.X.X.X:8200",job="vault"} | 0
vault_core_active{instance="X.X.X.X:8200",job="vault"} | 0

But the Consul tags show the real states: one node is active and the others are standby.
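
For anyone who wants to check whether their own setup shows the same symptom, here is a minimal sketch (not part of Vault or Prometheus) that takes series written in the `name{labels} value` form and reports instances whose `vault_core_active` series with and without the `cluster` label disagree. The function names, the sample instance addresses, and the simple label parser are assumptions made for illustration. The parser does not handle escaped quotes inside label values.

```python
import re
from collections import defaultdict

# Matches one labelled sample line such as: name{k="v",...} 1
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Parse one sample line into (name, labels dict, float value)."""
    m = SAMPLE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a labelled sample: {line!r}")
    labels = dict(LABEL_RE.findall(m.group("labels")))
    return m.group("name"), labels, float(m.group("value"))


def conflicting_instances(lines, metric="vault_core_active"):
    """Return instances whose series with and without `cluster` disagree."""
    seen = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue
        name, labels, value = parse_sample(line)
        if name != metric:
            continue
        key = "with_cluster" if "cluster" in labels else "without_cluster"
        seen[labels.get("instance", "")][key] = value
    return {
        inst: vals
        for inst, vals in seen.items()
        if len(vals) == 2 and vals["with_cluster"] != vals["without_cluster"]
    }


# Hypothetical sample data; the instance addresses are placeholders.
SAMPLES = [
    'vault_core_active{cluster="vault-cluster-85407450",instance="10.0.0.1:8200",job="vault"} 1',
    'vault_core_active{instance="10.0.0.1:8200",job="vault"} 0',
    'vault_core_active{cluster="vault-cluster-85407450",instance="10.0.0.2:8200",job="vault"} 1',
    'vault_core_active{instance="10.0.0.2:8200",job="vault"} 0',
]

if __name__ == "__main__":
    for inst, vals in sorted(conflicting_instances(SAMPLES).items()):
        print(inst, vals)
```

An instance that appears in the result exposes two `vault_core_active` series with different values, which is the behaviour described above.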
To Reproduce
Steps to reproduce the behavior:
`vault_core_active`.

Expected behavior
A value of 0 for standby nodes and 1 for active nodes.
Environment:
Vault server configuration file(s):
Additional context
Prometheus version: 2.15.2