Open anoncam opened 3 years ago
To clarify: the issue only occurs when interacting with the raft API endpoints.
Not sure I follow you, so bear with me: you have 5 nodes in an HA cluster, you use Istio, and you wanted to do what? Load-balance across all 5 at once? That will not work, by Vault's design: there is one active node (the leader), and the remaining nodes are standby nodes that forward requests to the active node.
To be highly available, one of the Vault server nodes grabs a lock within the data store. The successful server node then becomes the active node; all other nodes become standby nodes. At this point, if the standby nodes receive a request, they will either forward the request or redirect the client depending on the current configuration and state of the cluster -- see the sections below for details. Due to this architecture, HA does not enable increased scalability. In general, the bottleneck of Vault is the data store itself, not Vault core. For example: to increase the scalability of Vault with Consul, you would generally scale Consul instead of Vault.
source: https://www.vaultproject.io/docs/concepts/ha#high-availability-mode-ha
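To see which node currently holds the lock, each node can be asked individually through Vault's `sys/leader` status endpoint. The following is a minimal sketch, assuming the node is reachable at `http://127.0.0.1:8200` and that the endpoint can be queried without a token; the helper names `fetch_leader_status` and `describe_node` are made up for this example.

```python
import json
import urllib.request


def fetch_leader_status(vault_addr: str = "http://127.0.0.1:8200") -> dict:
    """Query sys/leader on a single Vault node (the address is an assumption)."""
    with urllib.request.urlopen(f"{vault_addr}/v1/sys/leader", timeout=5) as resp:
        return json.load(resp)


def describe_node(status: dict) -> str:
    """Classify a node as active or standby from its sys/leader response."""
    if not status.get("ha_enabled", False):
        return "ha-disabled"
    if status.get("is_self", False):
        # The node that answered is the one holding the lock.
        return "active"
    # A standby reports the address of the active node it forwards to.
    return f"standby (leader: {status.get('leader_address', 'unknown')})"
```

Calling `describe_node(fetch_leader_status(addr))` against each pod address in turn should report exactly one `active` node in a healthy cluster, with the rest showing up as standbys.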
Based on the Vault chart template, the chart should use server.ha.replicas for the StatefulSet when server.dev.enabled and server.standalone.enabled are both false. Suppose your viewpoint is correct: at the very least, the replica count in the server StatefulSet should match server.replicas. I am struggling to work out how to change the StatefulSet replicas using this Helm chart.
Changing ha.replicas works for me to scale the service. The template does indeed use that value correctly, as can be seen in the repository. I think the OP is confusing horizontal scalability with high availability.
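The replica selection discussed above can be modelled roughly as follows. This is a simplified Python sketch of the behaviour as described in this thread, not the chart's actual Go template; the function name `statefulset_replicas`, the single replica for dev and standalone modes, and the fallback of 3 replicas are assumptions made for illustration.

```python
def statefulset_replicas(values: dict) -> int:
    """Return the replica count the server StatefulSet would get (simplified model)."""
    server = values.get("server", {})
    dev_enabled = server.get("dev", {}).get("enabled", False)
    standalone_enabled = server.get("standalone", {}).get("enabled", False)

    # Assumption: dev and standalone modes always run a single server pod.
    if dev_enabled or standalone_enabled:
        return 1

    # Otherwise the HA setting decides; 3 is an assumed default for this sketch.
    return server.get("ha", {}).get("replicas", 3)
```

Under this model, setting `server.ha.replicas` to 5 only takes effect when neither dev nor standalone mode is enabled, which matches the reports above.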
Describe the bug
The Vault service in HA mode does not support load balancing.
To Reproduce
Steps to reproduce the behavior:
curl
or use the vault cli to enable raft snapshots
Expected behavior
HA capability to interact with raft storage.
Environment
Chart values: (vault config)
Additional context
The Istio VirtualService
Once the virtual service was changed to point to vault-active, everything worked as expected, but we have 4 stale pods, which isn't really HA anymore.