hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

Feature Request: Support multiple KMS Keys for Auto Unseal #6046

Closed justyns closed 7 months ago

justyns commented 5 years ago

Is your feature request related to a problem? Please describe.

We need to plan for a scenario where someone accidentally deletes a KMS key, or KMS itself is inaccessible in a region. I've tried the following as a simple test:

This fails because the KMS key is disabled/deleted/inaccessible from the new cluster.

Essentially I want the ability to take a consul snapshot and restore it on a server that does not have access to the KMS key used by the vault cluster with auto-unseal using awskms enabled.

Describe the solution you'd like

I think supporting multiple seal blocks or kms ids in the awskms seal block may be the simplest solution. This could look something like this:

seal "awskms" {
  region      = "us-east-1"
  kms_key_ids = [
    "region1/19ec80b0-dfdd-4d97-8164-c6examplekey",
    "region2/12e4e120-dfdd-4d97-8164-c6examplekey2"
  ]
}

In my example, I would specify KMS keys from two (or more) regions. Vault would then encrypt the master key against every KMS key in the list.

I would then be able to take a consul snapshot in region1, create a consul/vault cluster in region2, restore the snapshot, bring up the vault cluster and have it auto unseal as normal.
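
A minimal sketch of the "encrypt the master key against every key in the list" idea, to make the proposal concrete. Everything here is an assumption for illustration: the wrapping keys are local byte strings rather than KMS keys, the cipher is a toy built from the standard library (not real cryptography), and names such as `wrap_for_all` and `unwrap_with_any` are hypothetical, not Vault functions.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). NOT production cryptography."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(wrapping_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt plaintext under one wrapping key and append an HMAC tag."""
    nonce = os.urandom(16)
    stream = _keystream(wrapping_key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(wrapping_key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap(wrapping_key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt. Raises ValueError for a wrong key."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(wrapping_key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong wrapping key or corrupted blob")
    stream = _keystream(wrapping_key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def wrap_for_all(wrapping_keys: dict, master_key: bytes) -> dict:
    """Store one wrapped copy of the master key per configured key id."""
    return {key_id: wrap(key, master_key) for key_id, key in wrapping_keys.items()}


def unwrap_with_any(available_keys: dict, wrapped: dict) -> bytes:
    """Try every key that is still reachable; the first success wins."""
    for key_id, key in available_keys.items():
        if key_id in wrapped:
            try:
                return unwrap(key, wrapped[key_id])
            except ValueError:
                continue
    raise RuntimeError("no reachable key can unwrap the master key")
```

With two keys configured, losing either one still leaves a copy of the master key that the other can unwrap. The cost is that every key rotation has to rewrite all wrapped copies.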

If supporting multiple kms keys wouldn't work for some reason, then I think allowing the original unseal keys (recovery keys) to unseal a vault cluster would also be a reasonable solution.

Describe alternatives you've considered

Explain any additional use-cases

Without this feature, it is impossible afaict to properly back up a Vault cluster with Auto Unseal enabled. Even with Enterprise DR and replication, you wouldn't have a true backup as you can't back up the KMS keys.

Please let me know if I am misunderstanding something or if there is an alternative solution. Thanks!

chrishoffman commented 5 years ago

Why can't you point your region2 at your region1 KMS keys? There are certain things that make this unlikely to implement, such as key rotation and Seal Wrap (in enterprise).

jefferai commented 5 years ago

Even with Enterprise DR and replication, you wouldn't have a true backup as you can't back up the KMS keys.

Enterprise DR can use a different key per cluster, which does address this issue (and in fact is best practice).

justyns commented 5 years ago

Why can't you point your region2 at your region1 KMS keys? There are certain things that make this unlikely to implement, such as key rotation and Seal Wrap (in enterprise).

The idea was to plan for a DR event that could include the region1 KMS keys being unavailable for whatever reason.

Even with Enterprise DR and replication, you wouldn't have a true backup as you can't back up the KMS keys.

Enterprise DR can use a different key per cluster, which does address this issue (and in fact is best practice).

Using Enterprise DR, is there a way to take a backup of the Vault backend data and restore it somewhere that does not have access to KMS or whichever auto-unseal method is being used? My understanding is that this isn't possible and is what I meant by that statement. You're right though that having an Enterprise DR cluster would address this issue for most companies using Enterprise Vault.

I would still like to see a way of addressing this for the OSS version, or even those using Enterprise but not willing/able to run a DR cluster.

xynova commented 5 years ago

+1

rohit8925 commented 5 years ago

+1

rohit8925 commented 5 years ago

I am also trying to achieve a similar objective: I enabled backups for the Vault MySQL backend in Region-1 and tried restoring to another MySQL instance in Region-2. If Region-1 is down for some reason and the KMS key ID from Region-1 is unavailable, auto-unseal won't help. In that case, how do I unseal Vault so as to restore services? Also, I want to understand the usage of recovery keys. I agree with @justyns, it would be good to have multiple KMS keys for auto-unseal.

chris-ng-scmp commented 5 years ago

+1

CpuID commented 5 years ago

Hit this exact same thought process today:

Originally I was under the impression that the "recovery keys" could be used to unseal a vault cluster that had auto unseal enabled. This isn't the case, and makes their name seem misleading unless they serve a purpose other than generating root tokens or removing auto unseal (while auto unseal is still enabled and working.)

I was under the exact same impression, then I went to test it and hit the "invalid key" error trying to unseal with the recovery keys. I then found https://groups.google.com/forum/#!msg/vault-tool/-gdDm-KRlxw/4b6t0QnaAgAJ which confirmed a suspicion I had after my testing, which led me here.

  • Another option considered was to import custom key material into a new KMS key. Because the old encrypted data references the original KMS key id, this doesn't work when that key no longer exists.

I considered this idea too. Interesting, I didn't realise it references the KMS Key ID directly, what a PITA... so you couldn't just import the same key material into a new key in either the same or a different AWS region, and have it unseal I guess...?

CpuID commented 5 years ago

@ncabatoff as a followup to comments in https://github.com/hashicorp/vault/pull/7559 - you are right, would need to store the master key encrypted multiple times, in KMS + somewhere else potentially.

personally I'd prefer a way to use unseal shamir shares to do it as one method for recovery, to allow for "break glass" type situations where you take a last known good backup from your storage backend, load it up locally (laptop or whatever) and unseal it to obtain whatever is required.

using multiple KMS keys is one approach, would at least cover off a single AWS region going away. using shamir shares also would handle the situation of AWS dropping off entirely (global network misconfiguration etc), where you can't get to KMS anywhere.

tecnobrat commented 4 years ago

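Right now we believe our workaround is:

  • backup our consul
  • restore it
  • set seal.awskms.disabled: true
  • launch vault, connected to restored consul
  • run vault operator unseal -migrate -- multiple times, passing in the recovery keys
  • take another backup of consul

We now have a consul backup that is not sealed using KMS and can be unsealed with the recovery keys.

Then we:

  • Restore consul in new region
  • Spin up vault without awskms disabled
  • Run vault operator unseal -migrate again, passing in the recovery keys again.
  • Restart vault now that KMS unsealing is back on.

Yeah, it's a lot of work, but I don't know a current workaround otherwise. We simply cannot have our backends KMS sealed or we cannot unseal them in the event of a KMS or regional outage.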

CpuID commented 4 years ago

@vishalnayak @calvn @jefferai based on community feedback so far above, any chance of this getting some cycles in the near future? :) thanks!

perorope commented 4 years ago

Right now we believe our workaround is:

  • backup our consul
  • restore it
  • set seal.awskms.disabled: true
  • launch vault, connected to restored consul
  • run vault operator unseal -migrate -- multiple times, passing in the recovery keys
  • take another backup of consul

We now have a consul backup that is not sealed using KMS and can be unsealed with the recovery keys.

Then we:

  • Restore consul in new region
  • Spin up vault without awskms disabled
  • Run vault operator unseal -migrate again, passing in the recovery keys again.
  • Restart vault now that KMS unsealing is back on.

Yeah, it's a lot of work, but I don't know a current workaround otherwise. We simply cannot have our backends KMS sealed or we cannot unseal them in the event of a KMS or regional outage.

@tecnobrat I tried procedure you sent but getting

techs07 commented 4 years ago

As far as I know, 'vault operator unseal -migrate' only works while KMS is accessible. If you disconnect KMS, -migrate won't work, as Vault needs access to the active KMS even to migrate to a new one. At the least, there should be a provision to provide unseal keys when KMS is not available. In Kubernetes, Pods won't even start if there is any issue with KMS.

stevenscg commented 4 years ago

While multiple keys would be useful and worth implementing, I too would like to see recovery keys work for unseal if the configured auto-unseal mechanism is not available in an emergency.

personally I'd prefer a way to use unseal shamir shares to do it as one method for recovery, to allow for "break glass" type situations where you take a last known good backup from your storage backend, load it up locally (laptop or whatever) and unseal it to obtain whatever is required.

yevgeniyo-ps commented 4 years ago

I just created a new KMS key and stored it in the configuration. Vault unseals like a charm.... My backend is MySQL, Vault 1.3.7

glavoie commented 4 years ago

I also think we should have a way to recover Vault OSS using the recovery Shamir keys. The documentation does not make it clear that they can no longer be used for unsealing, which is surprising given that they are called recovery keys.

We found out about this limitation after a backup recovery fire drill in a clean/segregated cluster, where Vault expected to interact with the production KMS key.

If there is any issue with KMS, we have no way to recover either the cluster or any of its backups, which could be catastrophic. We could work around this by using a user provided KMS key, but this brings its own set of problems: where to store the original key, who should have access to it, and the downtime incurred when we need to rotate it (KMS migrate -> Shamir migrate -> KMS).

akurz commented 3 years ago

I just created a new KMS key and stored it in the configuration. Vault unseals like a charm.... My backend is MySQL, Vault 1.3.7

@yevgeniyo can you elaborate on this? I tried a restore test and created another AWS KMS key with external key material (so with the same secret) but was unable to unseal Vault with it. I updated the key id and access/secret key in the configuration but it's not working as expected :-(

[WARN] failed to unseal core: error="fetching stored unseal keys failed: failed to encrypt keys for storage: error decrypting data encryption key: AccessDeniedException: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access.

.... testing with the AWS commandline whether the credentials are working looks ok. Any hint?

zerkms commented 3 years ago

We could work around this by using a user provided KMS key

@glavoie that was my original assumption, but no: a different key with the same user provided key material IS NOT capable of decrypting AWS KMS encrypted secrets. Only exactly the same CMK (the key with the same key id) can do that. In other words: there is absolutely no workaround for restoring from backup a Vault instance that used AWS KMS auto-unseal. If you lose the AWS KMS key (through deletion or otherwise), then there is absolutely no way to restore from backup, even if you still have the user provided key material. Theoretically, one could reverse engineer (good luck?) the encryption mechanism AWS KMS uses, but meh.

I'm here from https://discuss.hashicorp.com/t/switching-to-different-aws-kms-key-id-with-the-same-key-material/19116/22 and it was my original assumption as well.

tmiroslav commented 3 years ago

Hi,

In the Vault Enterprise DR solution, when a KMS key is used for auto-unseal, is the same KMS key used for Vault backend data encryption in both DR sites? If yes, is it possible to set different keys for different sites, so that if there is a problem with one KMS key it is still possible to unseal the backend on the secondary site?

ahjohannessen commented 3 years ago

Any news on this? The ability to have Shamir as fallback when the KMS key gets destroyed seems like a paramount capability when things go south.

jlj77 commented 3 years ago

... to have Shamir as fallback when the KMS key gets destroyed seems like a paramount capability...

Please refrain from profanity. Of course it's important: that's why Enterprise customers have a number of options available.

ahjohannessen commented 3 years ago

Please refrain from profanity.

Sure, sorry about that.

Of course it's important: that's why Enterprise customers have a number of options available.

Hopefully something like support for Shamir fallback is considered regardless of OSS or Enterprise.

yermulnik commented 3 years ago

@jlj77

Of course it's important: that's why Enterprise customers have a number of options available.

Would you mind elaborating a bit more about Enterprise options re auto-unseal when original AWS KMS key is not accessible?

Also, maybe someone can give a clue re how we restore from the snapshot on another cluster which uses its own AWS KMS key to unseal? The issue is that when restoring from snapshot Vault attempts to unseal using AWS KMS key of the original cluster (the one that the snapshot was created from) and obviously fails since this AWS KMS key is not accessible.

jlj77 commented 3 years ago

Would you mind elaborating a bit more about Enterprise options re auto-unseal when original AWS KMS key is not accessible?

My apologies. I didn't mean to suggest that Enterprise customers had other options related to this specific scenario; only that DR obviates the need for this in many cases.

... The issue is that when restoring from snapshot Vault attempts to unseal using AWS KMS key of the original cluster...

I'm assuming that snapshot-force is no help in this scenario, yes?

yermulnik commented 3 years ago

I'm assuming that snapshot-force is no help in this scenario, yes?

Yep, it just bypasses checks and ignores the warning.

klebediev commented 3 years ago

So far I'm reading all this like: "Do NOT use KMS Auto Unseal with Vault OSS unless you are fine with not having ability to backup your cluster (which in fact makes it useless in production environments)"

Like some other participants of this issue, I've gone through this path of trial and error during DR testing like:

I wasn't able to find any cautions in the docs that such things aren't going to work. Terms like "recovery keys" are kinda misleading in this situation, and it's not obvious that the KMS key id can't be changed even if we recreate the key with the same key material. Maybe it's worth putting such cautions into the Vault docs?

I'm not a cybersecurity expert; could anybody explain what's the point in "hardcoding" KMS key id when in fact we just need key material to decrypt master key (if I understand correctly how it works)?

zerkms commented 3 years ago

what's the point in "hardcoding" KMS key id when in fact we just need key material to decrypt master key (if I understand correctly how it works)?

It's not done by Vault, it's a native AWS Envelope Encryption, so hashicorp engineers just use primitives provided by the AWS KMS.

klebediev commented 3 years ago

Another problem: Vault isn't able to detect that KMS key deletion is scheduled unless it restarts (during the waiting period, which is from 7 to 30 days in the case of AWS KMS, we may cancel the deletion). So, if key deletion is scheduled but Vault doesn't restart for 30 days, we'll get a surprise on the next restart.

It might be helpful if Vault continuously checked whether the key is available and responded somehow in case of unavailability, ranging from reflecting this in some metrics value (/v1/sys/metrics) to sealing the node (this could be added as an option to the seal "awskms" block; personally I'd prefer the latter).
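
A rough sketch of what such a periodic check could decide, not anything Vault implements today. The `seal_on_failure` option and the `check_once` and `decide` names are hypothetical. `fetch_key_state` is a caller-supplied function; with boto3 it could presumably return `client.describe_key(KeyId=...)["KeyMetadata"]["KeyState"]`, but that wiring is left out so the sketch stays self-contained.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"        # key is usable, nothing to do
    ALERT = "alert"  # surface the problem, e.g. through a metric
    SEAL = "seal"    # operator opted in to sealing the node


def decide(key_state: str, seal_on_failure: bool = False) -> Action:
    """Map a KMS key state to an action.

    Only "Enabled" counts as healthy; "PendingDeletion", "Disabled" and
    any other value are treated as a problem worth reporting.
    """
    if key_state == "Enabled":
        return Action.OK
    return Action.SEAL if seal_on_failure else Action.ALERT


def check_once(fetch_key_state, seal_on_failure: bool = False) -> Action:
    """Run one health check using a caller-supplied state fetcher."""
    try:
        state = fetch_key_state()
    except Exception:
        # Network or permission errors count as "key not reachable".
        state = "Unreachable"
    return decide(state, seal_on_failure)
```

Run on a timer, a check like this would flag a key in `PendingDeletion` while the deletion can still be cancelled, rather than at the next restart.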

klebediev commented 3 years ago

It's not done by Vault, it's a native AWS Envelope Encryption, so hashicorp engineers just use primitives provided by the AWS KMS.

thanks @zerkms! A little correction: as far as I understand in context of KMS it's called ciphertext encryption, not envelope encryption. An illustration why a key with different id but with the same key material can't be used for decrypting secret:

$ printf "myMasterPassword123" > masterkey
$ aws kms encrypt \
>     --key-id alias/abc \
>     --plaintext fileb://masterkey \
>     --output text \
>     --query CiphertextBlob | base64 \
>     --decode > masterkey.enc

$ aws kms decrypt \
>     --ciphertext-blob fileb://masterkey.enc \
>     --key-id alias/abc \
>     --output text \
>     --query Plaintext | base64 --decode
myMasterPassword123

$ aws kms decrypt \
>     --ciphertext-blob fileb://masterkey.enc \
>     --key-id alias/abc-restored \
>     --output text \
>     --query Plaintext | base64 --decode
An error occurred (IncorrectKeyException) when calling the Decrypt operation: The key ID in the request does not identify a CMK that can perform this operation.
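
The behaviour in the transcript above can be mimicked with a toy model. This is only a guess at the observable behaviour, not the real AWS blob format or API: `ToyKMS`, its methods and `IncorrectKeyError` are hypothetical names, and the XOR "cipher" is a stand-in with no security value. The point it illustrates is that the ciphertext records which key produced it, so a second key holding identical key material is still rejected.

```python
import hashlib


class IncorrectKeyError(Exception):
    """Raised when the blob was produced by a different key id."""


class ToyKMS:
    """In-memory stand-in for a KMS. Hypothetical, not the AWS API."""

    def __init__(self):
        self._material = {}  # key id -> raw key material

    def import_key(self, key_id: str, material: bytes) -> None:
        self._material[key_id] = material

    def _pad(self, key_id: str, length: int) -> bytes:
        # Toy pad derived from the key material only (not secure).
        return hashlib.shake_256(self._material[key_id]).digest(length)

    def encrypt(self, key_id: str, plaintext: bytes) -> dict:
        pad = self._pad(key_id, len(plaintext))
        body = bytes(a ^ b for a, b in zip(plaintext, pad))
        # The blob carries the id of the key that produced it.
        return {"key_id": key_id, "body": body}

    def decrypt(self, key_id: str, blob: dict) -> bytes:
        # The check is on the recorded key id, not on the key material.
        if blob["key_id"] != key_id:
            raise IncorrectKeyError(f"blob was encrypted under {blob['key_id']}")
        pad = self._pad(key_id, len(blob["body"]))
        return bytes(a ^ b for a, b in zip(blob["body"], pad))
```

Importing the same material as both `abc` and `abc-restored`, encrypting under `abc`, and decrypting with `abc-restored` raises `IncorrectKeyError` in this model, even though the material alone would be enough to undo the XOR.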

klebediev commented 3 years ago

Next question: why can't KMS Auto-unseal recovery keys be used for emergency unsealing when the KMS CMK isn't available?

StanislavPreply commented 3 years ago

What's the problem with making it possible to unseal vault with recovery keys, and then use vault operator unseal -migrate to migrate these keys to a newly created KMS key?

It's crazy that such an essential feature has not been addressed for more than 2 years already, not even a comment from hashicorp team...

VireshDoshi commented 3 years ago

This has been bugging me for some time! I currently use the auto unseal feature with azure key vault. I have basically concluded that the auto unseal key can not be purged / deleted and that is that ! It sounds like using auto unseal is not good for production.

jefferai commented 3 years ago

@StanislavPreply there have been a number of comments from HashiCorp team members; however, I think that this is simply a feature request that is not currently scheduled so there isn't much new information to provide.

techmouse84 commented 3 years ago

https://aws.amazon.com/about-aws/whats-new/2021/06/kms-multi-region-keys/

Just came across this. I have tested it by performing the following.

  1. create primary MRK and replica MRK and grant user A to both keys.
  2. create vault with auto_unseal using primary MRK
  3. remove user A access from primary MRK
  4. restart vault and verify that it fails to unseal as user is denied access to key.
  5. update hcl config to use replica key region. ( key id is the same )
  6. restart vault and verify that it's able to unseal.

I think this fits most of the use case.
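
Step 5 works because a multi-Region key keeps the same key id (prefixed `mrk-`) in every region, so only the region part of the ARN changes. A small sketch of that observation follows; the `replica_arn` helper is hypothetical and only computes a string. It does not create or verify the replica, which still has to exist in the target region.

```python
def replica_arn(primary_arn: str, replica_region: str) -> str:
    """Return the ARN a multi-Region key would have in another region.

    Expected shape: arn:<partition>:kms:<region>:<account>:key/mrk-<id>
    """
    parts = primary_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "kms":
        raise ValueError("not a KMS key ARN")
    if not parts[5].startswith("key/mrk-"):
        # Single-Region keys have no counterpart with the same id elsewhere.
        raise ValueError("not a multi-Region key")
    parts[3] = replica_region
    return ":".join(parts)
```

For example, `replica_arn("arn:aws:kms:us-east-1:111122223333:key/mrk-1234abcd12ab34cd56ef1234567890ab", "us-west-2")` changes only the region field, which is the same change the region setting in the seal block needs.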

PauloGDPeixoto commented 2 years ago

https://aws.amazon.com/about-aws/whats-new/2021/06/kms-multi-region-keys/

Just came across this. I have tested it by performing the following.

  1. create primary MRK and replica MRK and grant user A to both keys.
  2. create vault with auto_unseal using primary MRK
  3. remove user A access from primary MRK
  4. restart vault and verify that it fails to unseal as user is denied access to key.
  5. update hcl config to use replica key region. ( key id is the same )
  6. restart vault and verify that it's able to unseal.

I think this fits most of the use case.

It doesn't help if you lose access to the AWS account where you have the KMS key.

akpysec commented 2 years ago

Hello,

I've run into the same issue with the DR deployment in a different Region. Have any of you guys tried the "Multi-Region key" option in KMS as a solution? It basically allows replicating the key between regions, so that you can keep the same vault seal configuration (in theory)..

techmouse84 commented 2 years ago

Hello,

I've run into the same issue with the DR deployment in a different Region. Have any of you guys tried the "Multi-Region key" option in KMS as a solution? It basically allows replicating the key between regions, so that you can keep the same vault seal configuration (in theory)..

Yes, it's tested and proven. I'm using it in production.

akpysec commented 2 years ago

@techmouse84 Awesome, thanks!

glavoie commented 2 years ago

@akpysec We've also tested MRKs successfully with a recovery in a different AWS region from a snapshot taken in our main region.

erulabs commented 1 year ago

As a workaround for "what if the KMS store is deleted and I lose my entire vault", an "EXTERNAL" KMS store seems to work properly.

Create a new AWS KMS "External" key (in this example, using SHA256):

openssl rand -out PlaintextKeyMaterial.bin 32
openssl pkeyutl \
  -in PlaintextKeyMaterial.bin \
  -out EncryptedKeyMaterial.bin \
  -inkey KMS_WRAPPING_KEY \
  -keyform DER \
  -pubin -encrypt -pkeyopt rsa_padding_mode:oaep -pkeyopt rsa_oaep_md:sha256

Upload the key and use this as your kms_key_id. You can now create multiple KMS keys (or write a disaster recovery script) and simply stop vault, swap kms_key_id, restart vault, and you're unsealed and good to go. Of course... now you need somewhere safe to store your PlaintextKeyMaterial.bin ♻️

karras commented 1 year ago

See also https://github.com/hashicorp/vault/pull/18683, which looks exciting!

glavoie commented 1 year ago

@karras the change was just reverted: https://github.com/hashicorp/vault/commit/8b2b93c6e4625c5b3ba5870a3642669f6f3f7571

:(

NagenderPulluri commented 1 year ago

I have Vault deployed in region1, and for auto unseal I used an AWS KMS key (single-Region). As part of DR I'm restoring it in region2, but I'm not able to unseal Vault because the KMS key it used does not exist in region2. How can I resolve this?

yevgeniyo-ps commented 1 year ago

My guess is you used the default KMS key, which cannot be copied between regions. Create your own custom KMS key.

NagenderPulluri commented 1 year ago

@yevgeniyo no, I used an AWS CMK only, but that's a single-Region key. It can't be shared with other regions, it seems..

heatherezell commented 7 months ago

Hello! This has been addressed with Seal HA for Enterprise. If this does not address a specific use case, please feel free to open a new enhancement request. Thanks!