hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

Vault telemetry appears to stop reporting some metrics. #8797

Closed sharkannon closed 4 years ago

sharkannon commented 4 years ago

Describe the bug: We use the /v1/sys/metrics?format=prometheus endpoint to monitor the health of our Vault servers. Every few days it seems to stop reporting some metrics (but not all of them).

Metrics such as vault_core_handle_request_sum and vault_gcs_put_sum stop having values, while vault_gcs_get_sum, go_gc_duration_seconds, and vault_core_unseal continue to work.

Killing all the pods and doing a full restart does seem to fix it.
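One way to catch this condition early is to compare each scrape against a list of metric names that should always be present. The following is a minimal sketch under stated assumptions: `metric_names`, `missing_metrics`, and the sample scrape body are hypothetical and made up for illustration, and the parser only handles the plain line-oriented Prometheus text format.

```python
def metric_names(exposition):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or the first space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_metrics(exposition, expected):
    """Return the expected metric names that are absent from this scrape."""
    return sorted(set(expected) - metric_names(exposition))


# Made-up scrape body, for illustration only (not real Vault output).
SAMPLE_SCRAPE = """\
# TYPE vault_gcs_get summary
vault_gcs_get{quantile="0.5"} 35.0
vault_gcs_get_sum 420.1
vault_gcs_get_count 12
go_gc_duration_seconds_sum 0.35
"""

EXPECTED = [
    "vault_core_handle_request_sum",
    "vault_gcs_put_sum",
    "vault_gcs_get_sum",
]

print(missing_metrics(SAMPLE_SCRAPE, EXPECTED))
```

Running a check like this against each pod separately, rather than through a shared service address, also shows whether the gap is specific to one pod.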

To Reproduce: I'm unable to reproduce the behaviour on demand.

Expected behavior: Vault metrics continue to be reported.

Environment:

Vault server configuration file(s):

api_addr     = "https://vault:8200"
cluster_addr = "https://$(POD_IP_ADDR):8201"
log_level = "warn"
ui = true
seal "gcpckms" {
  project    = "vault"
  region     = "us"
  key_ring   = "vault"
  crypto_key = "vault-init"
}
storage "gcs" {
  bucket     = "vault-storage"
  ha_enabled = "true"
}
listener "tcp" {
  address     = "127.0.0.1:8200"
  tls_disable = "true"
}
listener "tcp" {
  address       = "$(POD_IP_ADDR):8200"
  tls_cert_file = "/etc/vault/tls/vault.crt"
  tls_key_file  = "/etc/vault/tls/vault.key"
  tls_disable_client_certs = true
  telemetry {
   unauthenticated_metrics_access = "true"
  }
}
telemetry {
 prometheus_retention_time = "24h"
}

sharkannon commented 4 years ago

The only thing I can see that may line up with it is that, at about the same time I start losing metrics, Vault goes through about 120k failed postgres plugin revokes.

ncabatoff commented 4 years ago

Vault (and HashiCorp products more generally) uses the go-metrics library to allow us to integrate with a variety of metrics systems. Unfortunately that means that we don't behave "normally" from a Prometheus perspective. Most Prometheus metrics sources will register a metric and then serve it forever: even if there's no activity in that metric at all, samples will continue to be emitted. That is not the case for Vault: once a metric has seen no activity for prometheus_retention_time, it disappears from the output.
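The expiry behaviour described here can be illustrated with a toy model. This is only a sketch of the idea, not the go-metrics implementation: the `ExpiringMetrics` class and the values fed to it are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class ExpiringMetrics:
    """Toy sink: a metric is dropped once it has been idle longer than the retention window."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._last = {}  # name -> (value, time of last update)

    def observe(self, name, value, now):
        """Record a new value and refresh the metric's last-update time."""
        self._last[name] = (value, now)

    def scrape(self, now):
        """Forget idle metrics, then return the remaining name -> value pairs."""
        self._last = {
            name: (value, seen)
            for name, (value, seen) in self._last.items()
            if now - seen <= self.retention_seconds
        }
        return {name: value for name, (value, _) in self._last.items()}


DAY = 24 * 60 * 60  # corresponds to prometheus_retention_time = "24h"
sink = ExpiringMetrics(DAY)
sink.observe("vault.core.handle_request", 1.2, now=0)
sink.observe("vault.runtime.alloc_bytes", 7937056, now=0)
sink.observe("vault.runtime.alloc_bytes", 8000000, now=DAY + 60)

# handle_request has been idle for more than a day, so only alloc_bytes remains.
print(sorted(sink.scrape(now=DAY + 60)))

# New activity makes the metric appear again.
sink.observe("vault.core.handle_request", 0.9, now=DAY + 120)
print(sorted(sink.scrape(now=DAY + 120)))
```

In this model a metric returns as soon as it sees activity again, so a metric that stays missing while the node is busy would point to a cause other than retention.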

sharkannon commented 4 years ago

Unfortunately that doesn't appear to be the case, for two reasons:

  1. The metrics in question keep changing right up until they disappear.
  2. They never come back (these are active metrics; we have systems using Vault continuously).

sharkannon commented 4 years ago

Here's a sample of the metrics I'm getting at the moment; none of the other expected metrics are showing.

{
"Timestamp": "2020-04-22 16:28:50 +0000 UTC",
"Gauges": [
{
"Name": "vault.runtime.alloc_bytes",
"Value": 7937056,
"Labels": {}
},
{
"Name": "vault.runtime.free_count",
"Value": 375333020,
"Labels": {}
},
{
"Name": "vault.runtime.heap_objects",
"Value": 40433,
"Labels": {}
},
{
"Name": "vault.runtime.malloc_count",
"Value": 375373440,
"Labels": {}
},
{
"Name": "vault.runtime.num_goroutines",
"Value": 43,
"Labels": {}
},
{
"Name": "vault.runtime.sys_bytes",
"Value": 73072890,
"Labels": {}
},
{
"Name": "vault.runtime.total_gc_pause_ns",
"Value": 354362600,
"Labels": {}
},
{
"Name": "vault.runtime.total_gc_runs",
"Value": 5792,
"Labels": {}
}
],
"Points": [],
"Counters": [],
"Samples": [
{
"Name": "vault.barrier.get",
"Count": 12,
"Rate": 42.08301467895508,
"Sum": 420.8301467895508,
"Min": 22.739765167236328,
"Max": 60.665557861328125,
"Mean": 35.069178899129234,
"Stddev": 10.686371258256065,
"Labels": {}
},
{
"Name": "vault.gcs.get",
"Count": 12,
"Rate": 42.01195087432861,
"Sum": 420.11950874328613,
"Min": 22.67365264892578,
"Max": 60.60808181762695,
"Mean": 35.00995906194051,
"Stddev": 10.691047537981518,
"Labels": {}
},
{
"Name": "vault.gcs.lock.value",
"Count": 4,
"Rate": 8.349518203735352,
"Sum": 83.49518203735352,
"Min": 18.19257926940918,
"Max": 22.06221580505371,
"Mean": 20.87379550933838,
"Stddev": 1.8152046115446498,
"Labels": {}
},
{
"Name": "vault.runtime.gc_pause_ns",
"Count": 1,
"Rate": 2179.4,
"Sum": 21794,
"Min": 21794,
"Max": 21794,
"Mean": 21794,
"Stddev": 0,
"Labels": {}
}
]
}

ncabatoff commented 4 years ago

Can you try sending a SIGUSR1 and sharing the metrics found in the logs?
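A dump of this kind is easier to compare across intervals once it is reduced to the set of metric names seen in each one. The sketch below is one assumed way to do that: the helper names are hypothetical, the regular expression is inferred from the `[timestamp][kind] 'name':` line shape of the dump shared in this thread, and the sample text is abbreviated and hand-edited for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as: [2020-04-22 16:38:40 +0000 UTC][S] 'vault.gcs.put': Count: 2 ...
LINE_RE = re.compile(r"^\[(?P<interval>[^\]]+)\]\[(?P<kind>[A-Z])\] '(?P<name>[^']+)':")


def names_per_interval(log_text):
    """Map each interval timestamp to the set of metric names logged for it."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            seen[match.group("interval")].add(match.group("name"))
    return dict(seen)


def dropped_between(seen, earlier, later):
    """Return names present in the earlier interval but absent from the later one."""
    return sorted(seen[earlier] - seen[later])


# Abbreviated, hand-edited sample (not a faithful excerpt of any real dump).
SAMPLE_LOG = """\
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.core.handle_request': Count: 1 Sum: 1236.243
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
"""

seen = names_per_interval(SAMPLE_LOG)
print(dropped_between(seen, "2020-04-22 16:38:40 +0000 UTC", "2020-04-22 16:38:50 +0000 UTC"))
```

A name that is missing from a single interval only means there was no activity in those ten seconds; it is the names that stay missing across many intervals that are worth investigating.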

sharkannon commented 4 years ago

I had to sanitize some of the data in there (only 3 or 4 metrics), but hopefully it's still useful.


[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.malloc_count': 5780418560.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.heap_objects': 235072.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.total_gc_pause_ns': 4930893312.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.alloc_bytes': 82270944.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.sys_bytes': 632258816.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.free_count': 5780183552.000
[2020-04-22 16:38:40 +0000 UTC][G] 'vault.runtime.total_gc_runs': 25917.000
[2020-04-22 16:38:40 +0000 UTC][C] 'vault.audit.log_response_failure': Count: 1 Sum: 0.000 LastUpdated: 2020-04-22 16:38:42.573870905 +0000 UTC m=+67653.718150218
[2020-04-22 16:38:40 +0000 UTC][C] 'vault.database.RevokeUser': Count: 2 Sum: 2.000 LastUpdated: 2020-04-22 16:38:45.955998874 +0000 UTC m=+67657.100278172
[2020-04-22 16:38:40 +0000 UTC][C] 'vault.database.postgres.RevokeUser': Count: 2 Sum: 2.000 LastUpdated: 2020-04-22 16:38:45.95600695 +0000 UTC m=+67657.100286249
[2020-04-22 16:38:40 +0000 UTC][C] 'vault.audit.log_request_failure': Count: 1 Sum: 0.000 LastUpdated: 2020-04-22 16:38:41.338867862 +0000 UTC m=+67652.483147165
[2020-04-22 16:38:40 +0000 UTC][C] 'vault.database.CreateUser': Count: 1 Sum: 1.000 LastUpdated: 2020-04-22 16:38:41.339089919 +0000 UTC m=+67652.483369223
[2020-04-22 16:38:40 +0000 UTC][C] 'vault.database.postgres.CreateUser': Count: 1 Sum: 1.000 LastUpdated: 2020-04-22 16:38:41.33909926 +0000 UTC m=+67652.483378567
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.audit.log_request': Count: 1 Sum: 0.017 LastUpdated: 2020-04-22 16:38:41.33887694 +0000 UTC m=+67652.483156245
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.gcs.delete': Count: 4 Min: 114.339 Mean: 116.546 Max: 117.813 Stddev: 1.538 Sum: 466.184 LastUpdated: 2020-04-22 16:38:46.819305176 +0000 UTC m=+67657.963584474
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.gcs.get': Count: 2 Min: 79.626 Mean: 87.999 Max: 96.372 Stddev: 11.841 Sum: 175.999 LastUpdated: 2020-04-22 16:38:45.955721016 +0000 UTC m=+67657.100000336
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.barrier.delete': Count: 4 Min: 114.397 Mean: 116.599 Max: 117.867 Stddev: 1.535 Sum: 466.396 LastUpdated: 2020-04-22 16:38:46.819314441 +0000 UTC m=+67657.963593736
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.expire.fetch-lease-times': Count: 2 Min: 0.002 Mean: 0.003 Max: 0.004 Stddev: 0.001 Sum: 0.006 LastUpdated: 2020-04-22 16:38:42.573812019 +0000 UTC m=+67653.718091333
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.core.fetch_acl_and_token': Count: 1 Sum: 1.199 LastUpdated: 2020-04-22 16:38:41.338815306 +0000 UTC m=+67652.483094622
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.gcs.put': Count: 2 Min: 122.699 Mean: 128.735 Max: 134.771 Stddev: 8.537 Sum: 257.470 LastUpdated: 2020-04-22 16:38:42.57373384 +0000 UTC m=+67653.718013165
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.expire.register': Count: 1 Sum: 257.820 LastUpdated: 2020-04-22 16:38:42.573796947 +0000 UTC m=+67653.718076266
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.audit.log_response': Count: 1 Sum: 0.035 LastUpdated: 2020-04-22 16:38:42.573880573 +0000 UTC m=+67653.718159887
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.runtime.gc_pause_ns': Count: 1 Sum: 67699.000 LastUpdated: 2020-04-22 16:38:42.754495926 +0000 UTC m=+67653.898775233
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.expire.revoke': Count: 2 Min: 905.557 Mean: 924.449 Max: 943.341 Stddev: 26.717 Sum: 1848.898 LastUpdated: 2020-04-22 16:38:46.819352695 +0000 UTC m=+67657.963631996
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.core.check_token': Count: 1 Sum: 1.263 LastUpdated: 2020-04-22 16:38:41.338848563 +0000 UTC m=+67652.483127976
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.database.postgres.CreateUser': Count: 1 Sum: 976.807 LastUpdated: 2020-04-22 16:38:42.315900479 +0000 UTC m=+67653.460179785
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.core.handle_request': Count: 1 Sum: 1236.243 LastUpdated: 2020-04-22 16:38:42.573824466 +0000 UTC m=+67653.718103784
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.database.postgres.RevokeUser': Count: 2 Min: 573.530 Mean: 602.803 Max: 632.077 Stddev: 41.399 Sum: 1205.607 LastUpdated: 2020-04-22 16:38:46.588075408 +0000 UTC m=+67657.732354705
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.route.revoke.postgres-sanatized': Count: 2 Min: 573.643 Mean: 602.923 Max: 632.204 Stddev: 41.409 Sum: 1205.847 LastUpdated: 2020-04-22 16:38:46.588093491 +0000 UTC m=+67657.732372794
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.expire.revoke-common': Count: 2 Min: 905.530 Mean: 924.424 Max: 943.317 Stddev: 26.719 Sum: 1848.847 LastUpdated: 2020-04-22 16:38:46.819342328 +0000 UTC m=+67657.963621627
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.barrier.put': Count: 2 Min: 122.785 Mean: 128.813 Max: 134.840 Stddev: 8.524 Sum: 257.625 LastUpdated: 2020-04-22 16:38:42.573755737 +0000 UTC m=+67653.718035051
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.database.RevokeUser': Count: 2 Min: 573.483 Mean: 602.759 Max: 632.035 Stddev: 41.402 Sum: 1205.518 LastUpdated: 2020-04-22 16:38:46.588061476 +0000 UTC m=+67657.732340781
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.barrier.get': Count: 13 Min: 0.013 Mean: 13.578 Max: 96.431 Stddev: 33.232 Sum: 176.515 LastUpdated: 2020-04-22 16:38:47.52051179 +0000 UTC m=+67658.664791107
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.expire.fetch-lease-times-by-token': Count: 1 Sum: 0.039 LastUpdated: 2020-04-22 16:38:41.337540289 +0000 UTC m=+67652.481819744
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.token.lookup': Count: 1 Sum: 0.294 LastUpdated: 2020-04-22 16:38:41.337549577 +0000 UTC m=+67652.481828978
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.policy.get_policy': Count: 2 Min: 0.003 Mean: 0.007 Max: 0.010 Stddev: 0.005 Sum: 0.013 LastUpdated: 2020-04-22 16:38:41.337633787 +0000 UTC m=+67652.481913093
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.database.CreateUser': Count: 1 Sum: 976.749 LastUpdated: 2020-04-22 16:38:42.315885951 +0000 UTC m=+67653.460165267
[2020-04-22 16:38:40 +0000 UTC][S] 'vault.route.read.postgres-sanatized': Count: 1 Sum: 977.039 LastUpdated: 2020-04-22 16:38:42.315930861 +0000 UTC m=+67653.460210157
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.alloc_bytes': 56330064.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.sys_bytes': 632258816.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.malloc_count': 5780481024.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.heap_objects': 230827.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.expire.num_leases': 2922.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.free_count': 5780250112.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.total_gc_pause_ns': 4931279872.000
[2020-04-22 16:38:50 +0000 UTC][G] 'vault.runtime.total_gc_runs': 25918.000
[2020-04-22 16:38:50 +0000 UTC][C] 'vault.audit.log_request_failure': Count: 10 Sum: 0.000 LastUpdated: 2020-04-22 16:38:59.68495394 +0000 UTC m=+67670.829233255
[2020-04-22 16:38:50 +0000 UTC][C] 'vault.audit.log_response_failure': Count: 10 Sum: 0.000 LastUpdated: 2020-04-22 16:38:59.685098041 +0000 UTC m=+67670.829377339
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.route.read.postgres-sanatized': Count: 10 Min: 0.051 Mean: 0.065 Max: 0.081 Stddev: 0.009 Sum: 0.651 LastUpdated: 2020-04-22 16:38:59.6850668 +0000 UTC m=+67670.829346115
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.gcs.delete': Count: 4 Min: 9.593 Mean: 91.096 Max: 126.957 Stddev: 54.700 Sum: 364.385 LastUpdated: 2020-04-22 16:38:57.746692798 +0000 UTC m=+67668.890972116
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.expire.revoke': Count: 1 Sum: 679.039 LastUpdated: 2020-04-22 16:38:57.746740097 +0000 UTC m=+67668.891019395
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.expire.fetch-lease-times-by-token': Count: 14 Min: 0.023 Mean: 0.042 Max: 0.171 Stddev: 0.038 Sum: 0.593 LastUpdated: 2020-04-22 16:38:59.905528404 +0000 UTC m=+67671.049807699
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.core.check_token': Count: 10 Min: 0.683 Mean: 1.618 Max: 8.175 Stddev: 2.310 Sum: 16.179 LastUpdated: 2020-04-22 16:38:59.684936696 +0000 UTC m=+67670.829216010
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.barrier.put': Count: 1 Sum: 162.792 LastUpdated: 2020-04-22 16:38:57.317364413 +0000 UTC m=+67668.461643723
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.expire.fetch-lease-times': Count: 14 Min: 0.002 Mean: 0.005 Max: 0.020 Stddev: 0.005 Sum: 0.064 LastUpdated: 2020-04-22 16:38:59.905522121 +0000 UTC m=+67671.049801417
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.policy.get_policy': Count: 20 Min: 0.002 Mean: 0.006 Max: 0.011 Stddev: 0.003 Sum: 0.125 LastUpdated: 2020-04-22 16:38:59.67682888 +0000 UTC m=+67670.821108186
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.core.fetch_acl_and_token': Count: 10 Min: 0.658 Mean: 1.580 Max: 8.120 Stddev: 2.304 Sum: 15.797 LastUpdated: 2020-04-22 16:38:59.684908593 +0000 UTC m=+67670.829187921
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.core.handle_request': Count: 10 Min: 0.772 Mean: 1.736 Max: 8.319 Stddev: 2.320 Sum: 17.363 LastUpdated: 2020-04-22 16:38:59.685076452 +0000 UTC m=+67670.829355765
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.barrier.delete': Count: 4 Min: 9.636 Mean: 91.148 Max: 127.016 Stddev: 54.706 Sum: 364.591 LastUpdated: 2020-04-22 16:38:57.746702833 +0000 UTC m=+67668.890982142
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.expire.revoke-by-token': Count: 1 Sum: 167.475 LastUpdated: 2020-04-22 16:38:57.509044811 +0000 UTC m=+67668.653324111
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.runtime.gc_pause_ns': Count: 1 Sum: 386871.000 LastUpdated: 2020-04-22 16:38:59.761266824 +0000 UTC m=+67670.905546132
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.barrier.get': Count: 34 Min: 0.011 Mean: 4.023 Max: 61.769 Stddev: 14.318 Sum: 136.799 LastUpdated: 2020-04-22 16:38:59.905441944 +0000 UTC m=+67671.049721240
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.audit.log_request': Count: 10 Min: 0.010 Mean: 0.014 Max: 0.023 Stddev: 0.004 Sum: 0.144 LastUpdated: 2020-04-22 16:38:59.684970293 +0000 UTC m=+67670.829249607
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.audit.log_response': Count: 10 Min: 0.006 Mean: 0.010 Max: 0.019 Stddev: 0.004 Sum: 0.100 LastUpdated: 2020-04-22 16:38:59.685106084 +0000 UTC m=+67670.829385402
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.gcs.get': Count: 3 Min: 17.853 Mean: 45.300 Max: 61.722 Stddev: 23.922 Sum: 135.901 LastUpdated: 2020-04-22 16:38:59.878590299 +0000 UTC m=+67671.022869658
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.gcs.list': Count: 4 Min: 22.150 Mean: 25.639 Max: 29.917 Stddev: 3.338 Sum: 102.557 LastUpdated: 2020-04-22 16:38:59.905369504 +0000 UTC m=+67671.049648801
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.barrier.list': Count: 4 Min: 22.197 Mean: 25.679 Max: 29.962 Stddev: 3.336 Sum: 102.715 LastUpdated: 2020-04-22 16:38:59.905412319 +0000 UTC m=+67671.049691639
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.gcs.put': Count: 1 Sum: 162.712 LastUpdated: 2020-04-22 16:38:57.317324103 +0000 UTC m=+67668.461603412
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.token.store': Count: 1 Sum: 162.835 LastUpdated: 2020-04-22 16:38:57.317375173 +0000 UTC m=+67668.461654483
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.token.lookup': Count: 10 Min: 0.237 Mean: 0.624 Max: 3.305 Stddev: 0.946 Sum: 6.245 LastUpdated: 2020-04-22 16:38:59.676729354 +0000 UTC m=+67670.821008653
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.token.revoke-tree': Count: 1 Sum: 612.777 LastUpdated: 2020-04-22 16:38:57.737054195 +0000 UTC m=+67668.881333499
[2020-04-22 16:38:50 +0000 UTC][S] 'vault.expire.revoke-common': Count: 2 Min: 145.188 Mean: 412.106 Max: 679.024 Stddev: 377.479 Sum: 824.212 LastUpdated: 2020-04-22 16:38:57.74671841 +0000 UTC m=+67668.890997707
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.expire.num_leases': 2921.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.sys_bytes': 632258816.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.heap_objects': 229903.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.total_gc_pause_ns': 4931307008.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.alloc_bytes': 64954264.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.malloc_count': 5780511232.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.free_count': 5780281344.000
[2020-04-22 16:39:00 +0000 UTC][G] 'vault.runtime.total_gc_runs': 25919.000
[2020-04-22 16:39:00 +0000 UTC][C] 'vault.audit.log_request_failure': Count: 1 Sum: 0.000 LastUpdated: 2020-04-22 16:39:04.252217172 +0000 UTC m=+67675.396496471
[2020-04-22 16:39:00 +0000 UTC][C] 'vault.database.CreateUser': Count: 1 Sum: 1.000 LastUpdated: 2020-04-22 16:39:04.252414404 +0000 UTC m=+67675.396693703
[2020-04-22 16:39:00 +0000 UTC][C] 'vault.database.postgres.CreateUser': Count: 1 Sum: 1.000 LastUpdated: 2020-04-22 16:39:04.252420647 +0000 UTC m=+67675.396699944
[2020-04-22 16:39:00 +0000 UTC][C] 'vault.audit.log_response_failure': Count: 1 Sum: 0.000 LastUpdated: 2020-04-22 16:39:05.321842139 +0000 UTC m=+67676.466121446
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.audit.log_request': Count: 1 Sum: 0.013 LastUpdated: 2020-04-22 16:39:04.252225412 +0000 UTC m=+67675.396504817
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.route.read.postgres-sanatized': Count: 1 Sum: 882.758 LastUpdated: 2020-04-22 16:39:05.135004337 +0000 UTC m=+67676.279283631
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.barrier.list': Count: 5 Min: 20.430 Mean: 22.824 Max: 25.753 Stddev: 2.280 Sum: 114.122 LastUpdated: 2020-04-22 16:39:01.856240559 +0000 UTC m=+67673.000519856
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.token.revoke-tree': Count: 2 Min: 594.656 Mean: 602.778 Max: 610.899 Stddev: 11.486 Sum: 1205.556 LastUpdated: 2020-04-22 16:39:02.240989851 +0000 UTC m=+67673.385269148
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.expire.revoke': Count: 2 Min: 667.156 Mean: 675.160 Max: 683.164 Stddev: 11.319 Sum: 1350.320 LastUpdated: 2020-04-22 16:39:02.251990138 +0000 UTC m=+67673.396269446
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.core.check_token': Count: 1 Sum: 0.830 LastUpdated: 2020-04-22 16:39:04.252203135 +0000 UTC m=+67675.396482431
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.token.lookup': Count: 1 Sum: 0.253 LastUpdated: 2020-04-22 16:39:04.251344943 +0000 UTC m=+67675.395624240
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.barrier.put': Count: 5 Min: 86.512 Mean: 128.073 Max: 156.000 Stddev: 32.425 Sum: 640.367 LastUpdated: 2020-04-22 16:39:05.32175961 +0000 UTC m=+67676.466038917
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.barrier.delete': Count: 8 Min: 10.426 Mean: 91.497 Max: 125.021 Stddev: 50.162 Sum: 731.976 LastUpdated: 2020-04-22 16:39:02.251949874 +0000 UTC m=+67673.396229183
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.runtime.gc_pause_ns': Count: 1 Sum: 26717.000 LastUpdated: 2020-04-22 16:39:05.763822232 +0000 UTC m=+67676.908101565
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.expire.fetch-lease-times': Count: 4 Min: 0.002 Mean: 0.003 Max: 0.006 Stddev: 0.002 Sum: 0.012 LastUpdated: 2020-04-22 16:39:05.321803105 +0000 UTC m=+67676.466082412
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.database.CreateUser': Count: 1 Sum: 882.502 LastUpdated: 2020-04-22 16:39:05.13495468 +0000 UTC m=+67676.279233976
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.core.handle_request': Count: 1 Sum: 1070.443 LastUpdated: 2020-04-22 16:39:05.321814346 +0000 UTC m=+67676.466093650
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.audit.log_response': Count: 1 Sum: 0.014 LastUpdated: 2020-04-22 16:39:05.321850049 +0000 UTC m=+67676.466129441
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.gcs.list': Count: 5 Min: 20.397 Mean: 22.784 Max: 25.682 Stddev: 2.268 Sum: 113.922 LastUpdated: 2020-04-22 16:39:01.856231054 +0000 UTC m=+67673.000510352
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.expire.revoke-common': Count: 4 Min: 134.390 Mean: 407.896 Max: 683.133 Stddev: 308.686 Sum: 1631.584 LastUpdated: 2020-04-22 16:39:02.251982711 +0000 UTC m=+67673.396262022
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.expire.fetch-lease-times-by-token': Count: 3 Min: 0.033 Mean: 0.043 Max: 0.062 Stddev: 0.016 Sum: 0.130 LastUpdated: 2020-04-22 16:39:04.251337675 +0000 UTC m=+67675.395616970
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.token.store': Count: 2 Min: 144.034 Mean: 148.935 Max: 153.836 Stddev: 6.931 Sum: 297.870 LastUpdated: 2020-04-22 16:39:01.810096388 +0000 UTC m=+67672.954375686
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.policy.get_policy': Count: 2 Min: 0.002 Mean: 0.007 Max: 0.011 Stddev: 0.006 Sum: 0.014 LastUpdated: 2020-04-22 16:39:04.251479093 +0000 UTC m=+67675.395758389
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.expire.revoke-by-token': Count: 2 Min: 156.314 Mean: 164.000 Max: 171.686 Stddev: 10.870 Sum: 328.000 LastUpdated: 2020-04-22 16:39:02.003210455 +0000 UTC m=+67673.147489751
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.core.fetch_acl_and_token': Count: 1 Sum: 0.794 LastUpdated: 2020-04-22 16:39:04.252183597 +0000 UTC m=+67675.396462896
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.gcs.put': Count: 5 Min: 86.455 Mean: 127.991 Max: 155.865 Stddev: 32.411 Sum: 639.957 LastUpdated: 2020-04-22 16:39:05.321748402 +0000 UTC m=+67676.466027722
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.gcs.delete': Count: 8 Min: 10.380 Mean: 91.433 Max: 124.964 Stddev: 50.154 Sum: 731.465 LastUpdated: 2020-04-22 16:39:02.251941021 +0000 UTC m=+67673.396220333
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.gcs.get': Count: 3 Min: 23.551 Mean: 38.695 Max: 61.008 Stddev: 19.731 Sum: 116.084 LastUpdated: 2020-04-22 16:39:01.887819192 +0000 UTC m=+67673.032098490
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.database.postgres.CreateUser': Count: 1 Sum: 882.550 LastUpdated: 2020-04-22 16:39:05.134969375 +0000 UTC m=+67676.279248675
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.expire.register': Count: 1 Sum: 186.747 LastUpdated: 2020-04-22 16:39:05.321788497 +0000 UTC m=+67676.466067804
[2020-04-22 16:39:00 +0000 UTC][S] 'vault.barrier.get': Count: 13 Min: 0.009 Mean: 8.962 Max: 61.070 Stddev: 18.794 Sum: 116.506 LastUpdated: 2020-04-22 16:39:07.52036366 +0000 UTC m=+67678.664642960
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.malloc_count': 5780554752.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.heap_objects': 268608.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.expire.num_leases': 2920.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.alloc_bytes': 85129408.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.free_count': 5780285952.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.total_gc_pause_ns': 4931307008.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.total_gc_runs': 25919.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
[2020-04-22 16:39:10 +0000 UTC][G] 'vault.runtime.sys_bytes': 632258816.000
[2020-04-22 16:39:10 +0000 UTC][C] 'vault.database.postgres.RevokeUser': Count: 1 Sum: 1.000 LastUpdated: 2020-04-22 16:39:13.147080579 +0000 UTC m=+67684.291359874
[2020-04-22 16:39:10 +0000 UTC][C] 'vault.audit.log_request_failure': Count: 6 Sum: 0.000 LastUpdated: 2020-04-22 16:39:17.429714721 +0000 UTC m=+67688.573994039
[2020-04-22 16:39:10 +0000 UTC][C] 'vault.audit.log_response_failure': Count: 6 Sum: 0.000 LastUpdated: 2020-04-22 16:39:17.429841574 +0000 UTC m=+67688.574120875
[2020-04-22 16:39:10 +0000 UTC][C] 'vault.database.RevokeUser': Count: 1 Sum: 1.000 LastUpdated: 2020-04-22 16:39:13.147073313 +0000 UTC m=+67684.291352609
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.route.read.postgres-sanatized': Count: 6 Min: 0.051 Mean: 0.066 Max: 0.073 Stddev: 0.008 Sum: 0.399 LastUpdated: 2020-04-22 16:39:17.429820042 +0000 UTC m=+67688.574099338
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.barrier.delete': Count: 2 Min: 114.155 Mean: 114.307 Max: 114.458 Stddev: 0.214 Sum: 228.614 LastUpdated: 2020-04-22 16:39:13.964965207 +0000 UTC m=+67685.109244503
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.database.RevokeUser': Count: 1 Sum: 589.164 LastUpdated: 2020-04-22 16:39:13.736267286 +0000 UTC m=+67684.880546590
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.token.lookup': Count: 6 Min: 0.223 Mean: 0.313 Max: 0.369 Stddev: 0.064 Sum: 1.876 LastUpdated: 2020-04-22 16:39:17.428500641 +0000 UTC m=+67688.572779950
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.core.fetch_acl_and_token': Count: 6 Min: 0.696 Mean: 0.870 Max: 1.114 Stddev: 0.166 Sum: 5.218 LastUpdated: 2020-04-22 16:39:17.429673135 +0000 UTC m=+67688.573952450
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.core.check_token': Count: 6 Min: 0.751 Mean: 0.908 Max: 1.151 Stddev: 0.161 Sum: 5.446 LastUpdated: 2020-04-22 16:39:17.429697366 +0000 UTC m=+67688.573976680
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.core.handle_request': Count: 6 Min: 0.872 Mean: 1.027 Max: 1.287 Stddev: 0.168 Sum: 6.161 LastUpdated: 2020-04-22 16:39:17.429828919 +0000 UTC m=+67688.574108217
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.barrier.get': Count: 20 Min: 0.009 Mean: 2.520 Max: 49.953 Stddev: 11.165 Sum: 50.397 LastUpdated: 2020-04-22 16:39:17.521428485 +0000 UTC m=+67688.665707797
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.route.revoke.postgres-sanatized': Count: 1 Sum: 589.316 LastUpdated: 2020-04-22 16:39:13.736299038 +0000 UTC m=+67684.880578334
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.expire.revoke-common': Count: 1 Sum: 868.108 LastUpdated: 2020-04-22 16:39:13.964996386 +0000 UTC m=+67685.109275684
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.expire.revoke': Count: 1 Sum: 868.137 LastUpdated: 2020-04-22 16:39:13.965008792 +0000 UTC m=+67685.109288088
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.policy.get_policy': Count: 12 Min: 0.002 Mean: 0.006 Max: 0.011 Stddev: 0.003 Sum: 0.076 LastUpdated: 2020-04-22 16:39:17.428591287 +0000 UTC m=+67688.572870591
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.audit.log_response': Count: 6 Min: 0.007 Mean: 0.009 Max: 0.015 Stddev: 0.003 Sum: 0.055 LastUpdated: 2020-04-22 16:39:17.429848822 +0000 UTC m=+67688.574128134
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.gcs.get': Count: 1 Sum: 49.877 LastUpdated: 2020-04-22 16:39:13.146841931 +0000 UTC m=+67684.291121231
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.gcs.delete': Count: 2 Min: 114.097 Mean: 114.250 Max: 114.404 Stddev: 0.217 Sum: 228.500 LastUpdated: 2020-04-22 16:39:13.9649563 +0000 UTC m=+67685.109235596
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.expire.fetch-lease-times': Count: 6 Min: 0.002 Mean: 0.003 Max: 0.004 Stddev: 0.000 Sum: 0.018 LastUpdated: 2020-04-22 16:39:17.428483053 +0000 UTC m=+67688.572762351
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.expire.fetch-lease-times-by-token': Count: 6 Min: 0.020 Mean: 0.035 Max: 0.045 Stddev: 0.009 Sum: 0.213 LastUpdated: 2020-04-22 16:39:17.428493046 +0000 UTC m=+67688.572772351
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.audit.log_request': Count: 6 Min: 0.011 Mean: 0.012 Max: 0.014 Stddev: 0.002 Sum: 0.074 LastUpdated: 2020-04-22 16:39:17.429722636 +0000 UTC m=+67688.574001950
[2020-04-22 16:39:10 +0000 UTC][S] 'vault.database.postgres.RevokeUser': Count: 1 Sum: 589.206 LastUpdated: 2020-04-22 16:39:13.736280322 +0000 UTC m=+67684.880559639
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.expire.num_leases': 2920.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.alloc_bytes': 82273352.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.malloc_count': 5780583424.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.heap_objects': 234939.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.num_goroutines': 735.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.sys_bytes': 632258816.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.free_count': 5780348416.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.total_gc_pause_ns': 4931344896.000
[2020-04-22 16:39:20 +0000 UTC][G] 'vault.runtime.total_gc_runs': 25920.000
[2020-04-22 16:39:20 +0000 UTC][C] 'vault.audit.log_request_failure': Count: 1 Sum: 0.000 LastUpdated: 2020-04-22 16:39:23.927750613 +0000 UTC m=+67695.072029913
[2020-04-22 16:39:20 +0000 UTC][C] 'vault.audit.log_response_failure': Count: 1 Sum: 0.000 LastUpdated: 2020-04-22 16:39:24.625717055 +0000 UTC m=+67695.769996354
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.core.check_token': Count: 1 Sum: 0.072 LastUpdated: 2020-04-22 16:39:23.927732444 +0000 UTC m=+67695.072011794
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.token.createAccessor': Count: 1 Sum: 109.873 LastUpdated: 2020-04-22 16:39:24.4350413 +0000 UTC m=+67695.579320597
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.expire.register-auth': Count: 1 Sum: 94.570 LastUpdated: 2020-04-22 16:39:24.625682941 +0000 UTC m=+67695.769962238
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.gcs.get': Count: 2 Min: 21.236 Mean: 42.826 Max: 64.416 Stddev: 30.533 Sum: 85.652 LastUpdated: 2020-04-22 16:39:27.822339184 +0000 UTC m=+67698.966618499
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.route.update.auth-gcp-': Count: 2 Min: 0.031 Mean: 198.659 Max: 397.288 Stddev: 280.904 Sum: 397.319 LastUpdated: 2020-04-22 16:39:24.325083122 +0000 UTC m=+67695.469362421
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.runtime.gc_pause_ns': Count: 1 Sum: 38259.000 LastUpdated: 2020-04-22 16:39:24.771575331 +0000 UTC m=+67695.915854645
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.expire.fetch-lease-times': Count: 2 Min: 0.002 Mean: 0.003 Max: 0.004 Stddev: 0.001 Sum: 0.006 LastUpdated: 2020-04-22 16:39:27.601821075 +0000 UTC m=+67698.746100382
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.expire.fetch-lease-times-by-token': Count: 2 Min: 0.034 Mean: 0.045 Max: 0.056 Stddev: 0.015 Sum: 0.091 LastUpdated: 2020-04-22 16:39:27.601831426 +0000 UTC m=+67698.746110727
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.gcs.list': Count: 3 Min: 17.682 Mean: 23.026 Max: 28.908 Stddev: 5.632 Sum: 69.078 LastUpdated: 2020-04-22 16:39:27.800996212 +0000 UTC m=+67698.945275519
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.expire.revoke': Count: 1 Sum: 666.863 LastUpdated: 2020-04-22 16:39:28.185930024 +0000 UTC m=+67699.330209319
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.barrier.put': Count: 4 Min: 94.466 Mean: 111.940 Max: 147.491 Stddev: 24.690 Sum: 447.762 LastUpdated: 2020-04-22 16:39:27.749370502 +0000 UTC m=+67698.893649798
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.token.store': Count: 1 Sum: 147.540 LastUpdated: 2020-04-22 16:39:27.749426138 +0000 UTC m=+67698.893705437
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.expire.revoke-common': Count: 2 Min: 139.582 Mean: 403.210 Max: 666.838 Stddev: 372.826 Sum: 806.420 LastUpdated: 2020-04-22 16:39:28.185922479 +0000 UTC m=+67699.330201775
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.expire.revoke-by-token': Count: 1 Sum: 162.216 LastUpdated: 2020-04-22 16:39:27.940671434 +0000 UTC m=+67699.084950832
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.audit.log_response': Count: 1 Sum: 0.010 LastUpdated: 2020-04-22 16:39:24.625723447 +0000 UTC m=+67695.770002745
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.audit.log_request': Count: 1 Sum: 0.016 LastUpdated: 2020-04-22 16:39:23.927759095 +0000 UTC m=+67695.072038425
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.gcs.put': Count: 4 Min: 94.411 Mean: 111.874 Max: 147.421 Stddev: 24.686 Sum: 447.497 LastUpdated: 2020-04-22 16:39:27.749354562 +0000 UTC m=+67698.893633862
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.token.create': Count: 1 Sum: 205.926 LastUpdated: 2020-04-22 16:39:24.531087841 +0000 UTC m=+67695.675367138
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.core.handle_login_request': Count: 1 Sum: 698.043 LastUpdated: 2020-04-22 16:39:24.625698582 +0000 UTC m=+67695.769977880
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.barrier.list': Count: 3 Min: 17.727 Mean: 23.086 Max: 28.996 Stddev: 5.655 Sum: 69.257 LastUpdated: 2020-04-22 16:39:27.801023644 +0000 UTC m=+67698.945302940
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.gcs.delete': Count: 4 Min: 11.930 Mean: 90.754 Max: 118.744 Stddev: 52.585 Sum: 363.015 LastUpdated: 2020-04-22 16:39:28.185887802 +0000 UTC m=+67699.330167098
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.barrier.delete': Count: 4 Min: 11.969 Mean: 90.806 Max: 118.801 Stddev: 52.594 Sum: 363.225 LastUpdated: 2020-04-22 16:39:28.185895586 +0000 UTC m=+67699.330174883
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.barrier.get': Count: 12 Min: 0.018 Mean: 7.174 Max: 64.477 Stddev: 19.052 Sum: 86.093 LastUpdated: 2020-04-22 16:39:29.114743519 +0000 UTC m=+67700.259022815
[2020-04-22 16:39:20 +0000 UTC][S] 'vault.token.revoke-tree': Count: 1 Sum: 590.143 LastUpdated: 2020-04-22 16:39:28.173916574 +0000 UTC m=+67699.318195871

sharkannon commented 4 years ago

Actually, I think I may know the issue. I'm using a Kubernetes service that points to ALL the nodes (including the secondaries). It appears that this doesn't work, and that only the master reports all the metrics. Sorry for the issues.
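Following this diagnosis, one option is to work out which pod is the active node before trusting a scrape. The sketch below is a hedged illustration rather than a recommended setup: it assumes the JSON body of each pod's /v1/sys/leader response has already been fetched and decoded, it relies on the `is_self` field of that response, and the pod names and addresses are hypothetical.

```python
def active_nodes(leader_status_by_pod):
    """Return the pods whose /v1/sys/leader response says they are the leader."""
    return sorted(
        pod
        for pod, status in leader_status_by_pod.items()
        if status.get("is_self")
    )


# Hypothetical decoded /v1/sys/leader responses, keyed by pod name.
STATUSES = {
    "vault-0": {"ha_enabled": True, "is_self": False, "leader_address": "https://vault-1:8200"},
    "vault-1": {"ha_enabled": True, "is_self": True, "leader_address": "https://vault-1:8200"},
    "vault-2": {"ha_enabled": True, "is_self": False, "leader_address": "https://vault-1:8200"},
}

print(active_nodes(STATUSES))
```

Dashboards and alerts for request-level metrics can then be limited to the pod this returns, while runtime metrics are still collected from every pod.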