hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

sys/health reports healthy node but raft has cleaned it up #17920

Closed mhristof closed 1 year ago

mhristof commented 1 year ago

Describe the bug: A Vault node reports 200 on sys/health even though the rest of the cluster has already removed it after it was reported as raft-unhealthy.

To Reproduce: Steps to reproduce the behavior:

  1. Set up a Vault cluster with 5 nodes.
  2. Configure autopilot with:

     vault operator raft autopilot set-config \
       -cleanup-dead-servers=true \
       -min-quorum=3 \
       -dead-server-last-contact-threshold=20s \
       -max-trailing-logs=5000

  3. Set up an ALB in front of the cluster with a health check on the /v1/sys/health endpoint, matching status codes 200,429 (see the sketch after this list).
  4. Scale the cluster up and down every 10 minutes and wait for a node to be removed from the raft cluster while it still reports 200.
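A minimal sketch of the step 3 health check via the AWS CLI, assuming an existing target group (the ARN is a placeholder):

# TG_ARN is a placeholder; substitute the real target group ARN.
TG_ARN="arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/vault/0123456789abcdef"

# Point the ALB health check at sys/health and treat both 200 (active)
# and 429 (standby) as healthy, per the reproduction steps above.
aws elbv2 modify-target-group \
  --target-group-arn "$TG_ARN" \
  --health-check-protocol HTTP \
  --health-check-path /v1/sys/health \
  --matcher HttpCode=200,429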

Expected behavior: sys/health returns something other than 200, so that the node can be removed from the autoscaling group.
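For reference, what a node answers can be checked with a plain probe; the status code mapping in the comments follows the Vault API docs:

# Print only the HTTP status code of the local node's health endpoint.
# Default sys/health codes: 200 initialized/unsealed/active,
# 429 unsealed standby, 472 DR secondary, 473 performance standby,
# 501 not initialized, 503 sealed.
curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8200/v1/sys/health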

Environment:

From an external node checking the cluster:

# vault operator raft list-peers
Node                   Address               State       Voter
----                   -------               -----       -----
i-00caeed7f6d95b98b    172.31.15.83:8201     leader      true
i-0686805d9fdfbf415    172.31.40.25:8201     follower    true
i-08fdbf3c65a687487    172.31.13.128:8201    follower    true
i-078e1fb81d15c62a1    172.31.20.76:8201     follower    true
# vault read -format json sys/storage/raft/autopilot/state | jq '.data.servers | to_entries[] | .value | {"name": .name, "last_index": .last_index, "last_contact": .last_index, "healty": .healthy}' -c
{"name":"i-00caeed7f6d95b98b","last_index":554826,"last_contact":554826,"healty":true}
{"name":"i-0686805d9fdfbf415","last_index":554255,"last_contact":554255,"healty":true}
{"name":"i-078e1fb81d15c62a1","last_index":554269,"last_contact":554269,"healty":true}
{"name":"i-08fdbf3c65a687487","last_index":554261,"last_contact":554261,"healty":true}
{"name":"i-0d8a130272ed889ef","last_index":0,"last_contact":0,"healty":false}
The same two commands a short while later, after autopilot has cleaned up the dead node:

Node                   Address               State       Voter
----                   -------               -----       -----
i-00caeed7f6d95b98b    172.31.15.83:8201     leader      true
i-0686805d9fdfbf415    172.31.40.25:8201     follower    true
i-08fdbf3c65a687487    172.31.13.128:8201    follower    true
i-078e1fb81d15c62a1    172.31.20.76:8201     follower    true
{"name":"i-00caeed7f6d95b98b","last_index":555067,"last_contact":555067,"healty":true}
{"name":"i-0686805d9fdfbf415","last_index":554834,"last_contact":554834,"healty":true}
{"name":"i-078e1fb81d15c62a1","last_index":554849,"last_contact":554849,"healty":true}
{"name":"i-08fdbf3c65a687487","last_index":554841,"last_contact":554841,"healty":true}

Vault server configuration file(s):

Each node uses its AWS instance ID as its node_id, for example:

listener "tcp" {
  address     = "0.0.0.0:8200"
    tls_disable = 1
  }

storage "raft" {
  path = "/data/vault"
  node_id = "i-0a9f85305490d16e9"

  retry_join {
    auto_join = "provider=aws addr_type=private_v4 tag_key=uid tag_value=7"
    auto_join_scheme = "http"
  }
}

disable_mlock = true

max_lease_ttl     = "8760h"
default_lease_ttl = "8760h"

ui = false
cluster_addr = "http://{{ GetPrivateIP }}:8201"
api_addr = "http://{{ GetPrivateIP }}:8200"

Additional context: From the faulty node, I can see

$ vault operator raft list-peers
Error reading the raft cluster configuration: Get "http://172.31.15.83:8200/v1/sys/storage/raft/configuration": dial tcp 172.31.15.83:8200: i/o timeout
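The timeout shows the evicted node can no longer reach the leader, yet its own health endpoint keeps answering. A sketch of a direct probe, with NODE_IP as a placeholder for the evicted instance's private address:

# NODE_IP is a placeholder for the evicted node's private IPv4.
NODE_IP="<evicted-node-private-ip>"

# Still returns 200 even though raft has removed the node, so the ALB
# health check keeps passing and the ASG never replaces the instance.
curl -s -o /dev/null -w '%{http_code}\n' "http://${NODE_IP}:8200/v1/sys/health"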
hghaf099 commented 1 year ago

@mhristof Thanks for filing this ticket. I am trying to understand the issue better. Would you please elaborate on your setup? Are these Vault servers running on the same host? I see that one node is unhealthy, and it is not the leader node, so I am wondering which server the health request is sent to. As mentioned in the sys/health docs, 200 will be returned only from the active (leader) node, but it is not clear from the output snippet which server received the health request. Also, would you please describe how many servers are being removed from and added to the cluster every 10 minutes? What happens to the active node in that scenario? Is it also being replaced with another node?
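One way to answer the "which server gets the request" question empirically is to probe each peer directly on the API port 8200 (list-peers above shows the cluster port 8201); appending standbyok=true makes standbys return 200 as well:

# Expect 200 only from the active node (172.31.15.83 above) and 429
# from the standbys; add ?standbyok=true to get 200 from standbys too.
for ip in 172.31.15.83 172.31.40.25 172.31.13.128 172.31.20.76; do
  printf '%s -> ' "$ip"
  curl -s -o /dev/null -w '%{http_code}\n' "http://${ip}:8200/v1/sys/health"
done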

heatherezell commented 1 year ago

Hello @mhristof - since it's been a while since we've heard from you on this issue, I'm going to go ahead and close it for now. Please feel free to re-open it or open a new issue as needed. Thanks!