Closed: @drawks closed this issue 1 month ago
Thanks, @drawks! Always appreciate seeing your comments and contributions. We'll get this in front of the engineering team soon. :)
Have you seen https://www.vaultproject.io/docs/concepts/integrated-storage#manual-recovery-using-peers-json? I'm not 100% certain that it works for an HA-only raft cluster, but I think it does.
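For reference, that procedure works by placing a `peers.json` file in each surviving node's raft data directory before restarting Vault. A minimal sketch of generating one (the node IDs and addresses below are made-up placeholders; check the exact schema against the linked docs):

```python
import json

# Hypothetical surviving nodes. "id" must match each node's raft node_id,
# and "address" is its cluster address (port 8201 by default).
peers = [
    {"id": "node-a", "address": "10.0.1.10:8201", "non_voter": False},
    {"id": "node-b", "address": "10.0.1.11:8201", "non_voter": False},
]

# This file goes in <raft data dir>/raft/peers.json on every node; Vault
# ingests it on startup and then deletes it.
with open("peers.json", "w") as f:
    json.dump(peers, f, indent=2)
```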
Otherwise, I'm not clear on the use case. How did you get into this situation where the vault.db and raft.db were missing on all nodes? It might be that this is far enough outside of realistic scenarios that manually editing the storage backend is the right approach.
@ncabatoff As I mentioned, I was doing a disaster recovery exercise. The setup simulated a complete standup using ONLY a backup of the database: the new machines had no preserved local storage and were also brought up in a new network segment. The presumption was that, when using raft for `ha_storage` only, the cluster could be trivially recreated from the contents of the storage backend. That is /partially/ true: if you start a single machine with HA disabled, it will read in the storage and give you functional access to the data. However, the `core/raft/tls` key is still persisted. Fundamentally, my ask here is a native function for completely wiping the HA data from the primary datastore, so that HA can be reinitialized without the ghosts of the past interfering.
I have not attempted the manual recovery documented in your link. I will give it a try, but my original ask still seems valid, so I'll restate it once more: state from the HA configuration is persisted in the primary data store, which prevents reuse of that data store in a different HA configuration.
Yeah, that's fair. Out of curiosity, why are you using mysql+raft rather than the native mysql HA support?
We found the mysql HA support to be generally less reliable.
Then my question is: why not use raft exclusively, rather than maintain two storage subsystems, one of which is only community-supported?
It is a valid question, but really an aside to the issue at hand. We have considerable experience with and infrastructure to support mysql as a primary data store. Native raft as the primary backend is appealing in some ways, but is just not the architectural decision we've made at the moment.
hey @drawks - on a side note, since I do see a lot of contributions from you here, I was wondering if you've ever done a performance comparison (speed, memory / data sizes, and overall CPU use) of integrated storage vs MySQL (that would be interesting to see, separate from this issue).
Specific to the ask: in these scenarios, can the less initiated follow the Vault Cluster Lost Quorum Recovery steps to recover any remaining node (typically opting for the last known leader) and thereafter rescale / rejoin the other nodes anew? Obviously the `sys/raw` approach, with surgery on internal state during recovery as highlighted earlier, is always an option.
**Is your feature request related to a problem? Please describe.**
I recently conducted a disaster recovery exercise on a vault cluster (mysql storage backend, raft `ha_storage`) and discovered that if all of the machine-local raft state (`raft.db` and `vault.db`) is missing from all nodes AND none of the nodes was previously a participant in the cluster, then no node can be elected active. This makes sense, since there is no raft cluster assembled in which to conduct the vote. However, the current bootstrap API (`sys/storage/raft/bootstrap`) refuses to bootstrap a raft cluster because of the existence of the `core/raft/tls` key in the storage backend, and attempts to call it return an error. If that key is manually removed from the storage backend, for instance by deleting it from mysql in my case, the bootstrap API once again becomes available and you can reassemble the raft cluster.
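The manual workaround amounts to deleting that single row from the backend. A sketch of the delete, demonstrated here against an in-memory sqlite3 stand-in for the MySQL table (the `vault` table name and `vault_key`/`vault_value` columns are assumptions about the MySQL backend's default schema; verify against your deployment and take a backup before touching real data):

```python
import sqlite3

# Stand-in for the MySQL storage backend's table (schema assumed).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vault (vault_key TEXT PRIMARY KEY, vault_value BLOB)")
db.execute("INSERT INTO vault VALUES ('core/raft/tls', x'00')")
db.execute("INSERT INTO vault VALUES ('core/example-key', x'00')")  # placeholder row

# The actual surgery: remove only the raft TLS keyring entry, after which
# sys/storage/raft/bootstrap no longer refuses to bootstrap.
db.execute("DELETE FROM vault WHERE vault_key = 'core/raft/tls'")
db.commit()

remaining = [row[0] for row in db.execute("SELECT vault_key FROM vault")]
```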
**Describe the solution you'd like**
Ideally I'd like to not have to do "brain surgery" on the storage backend during a disaster recovery exercise. If the `sys/storage/raft/bootstrap` API could take an argument instructing it to destructively re-bootstrap in spite of evidence of an existing cluster, recovery would be much cleaner.

**Describe alternatives you've considered**
AFAICT the only alternative is to manually edit the storage backend.
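What that proposal might look like from a client's perspective, sketched with stdlib urllib. The `force` field is purely hypothetical (it is the thing being requested here, not an existing Vault API parameter), the address and token are placeholders, and the request is only constructed, not sent:

```python
import json
import urllib.request

VAULT_ADDR = "http://127.0.0.1:8200"  # placeholder address
TOKEN = "s.example"                   # placeholder token

# Hypothetical: a "force" flag telling the bootstrap endpoint to discard
# evidence of a previous cluster (e.g. the persisted core/raft/tls key).
payload = json.dumps({"force": True}).encode()
req = urllib.request.Request(
    f"{VAULT_ADDR}/v1/sys/storage/raft/bootstrap",
    data=payload,
    headers={"X-Vault-Token": TOKEN},
    method="POST",
)
```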
**Explain any additional use-cases**
If the number of nodes in a cluster drops below the quorum threshold, this feature would also allow forcing a new leader so that the raft membership could be modified. Currently you cannot add or remove peers if there is no active leader.