Open maxb opened 2 years ago
Thanks Max! I'm going to tag @yhyakuna and @taoism4504 here as well for thoughts on updating the documentation.
Interesting discussion here.
The auto-unseal provider (AWS KMS, HSM, etc.) does not intervene in Vault's operational tasks. So, if you lose (or revoke) your initial root token, a threshold number of recovery keys is required to regenerate a root token. (So, that may be where the "recovery" naming came from?)
The docs under the HSM section explain the recovery key better. We could move that description up under the auto-unseal section to better explain the purpose of recovery keys.
Personally, I feel a documentation-only solution to this issue would not go far enough, given the possibility of horrendous data loss if a Vault operator carries this misunderstanding through to production.
@vinodmu I'm bringing this to your attention since this request isn't about docs but about the product.
So this means that I'm not able to restore a backup with these recovery keys on a different server if something bad happens to the production server, right? Is there any possible solution to make my backup accessible?
@dops-at Currently, with Vault as it is today, you need to either:
- Be confident that you can access the remote auto-unseal KMS from the new replacement server
- Or stop using auto-unseal and use the regular Shamir seal, where the shares of the unseal key really can decrypt the backup on their own in a disaster scenario
This is unfortunately a long-standing problem with Vault:
I still can't understand why HC won't update Vault to keep one copy of the root key wrapped with the auto-unseal key and another copy wrapped with the recovery key. Seems like that should be a basic disaster recovery feature that doesn't even really change the threat model (as the recovery key is already trusted enough to produce root tokens).
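To make the idea above concrete, here is a minimal sketch of keeping two independently wrapped copies of the same root key. It is an illustration only, not Vault's storage format or code: the function names (`wrap`, `unwrap`, `store_root_key`) and the dictionary keys are hypothetical, and the cipher is a toy SHA-256 keystream with an HMAC tag, swapped in for a real AEAD such as AES-GCM so that the example runs with only the Python standard library.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from one wrapping key."""
    return hashlib.sha256(label + key).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Not for production use."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(wrapping_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate plaintext under wrapping_key."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(wrapping_key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(_subkey(wrapping_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unwrap(wrapping_key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt. Raises ValueError on a wrong key."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(wrapping_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted blob")
    stream = _keystream(_subkey(wrapping_key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


def store_root_key(root_key: bytes, seal_key: bytes, recovery_key: bytes) -> dict:
    """Hypothetical layout: one copy per wrapping key, so either can recover it."""
    return {
        "root_wrapped_by_seal": wrap(seal_key, root_key),
        "root_wrapped_by_recovery": wrap(recovery_key, root_key),
    }
```

With a layout like this, losing the seal key would leave the second copy usable via `unwrap(recovery_key, stored["root_wrapped_by_recovery"])`, which is the disaster-recovery property being asked for.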
I have now realised that if the Enterprise "sealwrap" feature is enabled, just having the recovery keys being able to reconstruct the root key isn't good enough - since the definition of "sealwrap" is to have various important parts of Vault data directly coupled to the seal device's encryption.
It seems like this feature request would only be viable in non-sealwrap configurations.
It looks like this is now being worked on, in #18683 ! :-)
Implemented by https://github.com/hashicorp/vault/pull/18683
@sgmiller Please re-open since the implementing PR was reverted, thanks.
Reopening this, per Maxb's comment above.
I am not sure about the reasons for the revert of @sgmiller's implementation.
https://github.com/hashicorp/vault/pull/18942
Rollback due to discovered complexities with ENT interactions.
But does anybody know if the feature can be adjusted so that these "complexities with ENT interactions" are handled?
I've recently migrated to KMS unseal, and was disappointed to find this issue after the fact. The "Recovery Key" nomenclature is very misleading given that it's not remotely possible to recover your cluster if the KMS key is lost. It would be great to see the docs updated.
In particular, some additional warnings about ensuring appropriate guards for the KMS key (multi-region / SCPs etc.) would be great: https://developer.hashicorp.com/vault/docs/configuration/seal/awskms
Also, although this guide does note that recovery keys can't actually recover anything, it would be good to add a further mention in the seal migration section, as the note is easy to skip past if you're looking for docs on seal migration.
When you configure Vault with Auto-unseal, you get Recovery Keys.
The name itself makes a strong implication that their purpose is to recover Vault when the auto-unseal is down.
But actually this turns out to be a false promise - they cannot!
Recovery keys are actually only useful to prove to the Vault software that a quorum of administrators are acting together, to authenticate operations which require that.
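As a rough illustration of what "a quorum acting together" means, the sketch below splits a secret into shares so that any threshold-sized subset reconstructs it and fewer shares do not. This is a generic Shamir-style scheme over a prime field written for this example, not Vault's implementation; `split`, `combine`, and `PRIME` are names chosen here. Note what it does not show: in an auto-unseal setup, reconstructing this value only proves the quorum to Vault, it does not decrypt the stored data.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime used as the field modulus (illustrative choice)


def split(secret: int, shares: int, threshold: int) -> List[Tuple[int, int]]:
    """Split secret into `shares` points; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine(points: List[Tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `combine(split(s, 5, 3)[:3])` returns `s` for any `s` below `PRIME`, while combining only two of those shares yields an unrelated value.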
This is a pretty common question to come up over on discuss.hashicorp.com.
I propose that something needs to be done to fix this misleading use of terminology, which has a high risk of causing data loss to users.
I can imagine three possible ways to proceed:
1) Rename "Recovery keys" to "Administrator authentication keys" throughout the product and documentation.
2) Transition "Recovery keys" to genuinely enable recovery from a lost auto-unseal, by storing a second copy of the key material that is currently encrypted only with the auto-unseal key, this copy being encrypted with the recovery key instead.
3) Effectively, both: in a future version of Vault, support a user choice between "Auto-unseal with recovery keys" and "Auto-unseal with administrator authentication keys".