hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

GCP KMS Autounseal Error bug #17850

Closed KosShutenko closed 1 year ago

KosShutenko commented 1 year ago

Describe the bug: I installed Vault clusters from chart v0.22.0 into several Kubernetes clusters (v1.23.8-gke.1900) with auto-unseal via GCP KMS. After upgrading the Vault chart to version 0.22.1, I got this error:

Error parsing Seal configuration: error checking key existence: rpc error: code = PermissionDenied desc = Permission 'cloudkms.cryptoKeys.get' denied on resource 'projects/MY_PROJECT_NAME/locations/europe-west2/keyRings/vault-unseal-ring/cryptoKeys/vault-unseal-key' (or it may not exist).

Downgrading to 0.22.0 resolved the issue, but now I cannot keep my Vault clusters up to date.

To Reproduce:

  1. Install a Vault cluster from chart v0.22.0.
  2. Configure auto-unseal via GCP KMS.
  3. Upgrade the chart from 0.22.0 to 0.22.1 and wait until the StatefulSet has been updated.
  4. Delete one pod so that it is recreated, then check its logs.

Expected behavior: All pods should be upgraded, start on the new version, and auto-unseal via GCP KMS.

tvoran commented 1 year ago

Hi @KosShutenko, are you specifying the vault server version in your chart's values, or is it using the default for the chart version? It sounds like this may be an issue between Vault 1.11.3 and 1.12.0, but wanted to make sure.

KosShutenko commented 1 year ago

Hello. No, I didn't set the Vault server version. I hadn't even pinned the chart version before; I only set the chart version after hitting the error, in order to downgrade.

tvoran commented 1 year ago

Gotcha. Since the default Vault version in chart 0.22.0 was 1.11.3, and for chart 0.22.1 was 1.12.0, this sounds like it may be an issue with 1.12. I'm going to transfer this issue over to the main vault repository. It would be helpful if you could include the vault config you're using.

Vault 1.12.1 has also been recently released, so you may want to try it and see if there's any difference. We recommend setting an explicit vault version in the chart with the server.image settings so unexpected upgrades don't occur.
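As a rough sketch of pinning the version, the function below upgrades the chart while setting `server.image.tag` explicitly. The release name, namespace, and the chart/Vault version pair are placeholders chosen for illustration, and it assumes the HashiCorp repo was added under the alias `hashicorp` (for example with `helm repo add hashicorp https://helm.releases.hashicorp.com`).

```shell
# Placeholder values (assumptions): adjust to your own release.
RELEASE="vault"
NAMESPACE="vault"
CHART_VERSION="0.22.1"
VAULT_VERSION="1.12.1"

# Upgrade the chart but pin the Vault server image tag, so a chart bump
# does not silently change the Vault version.
pin_vault_version() {
  helm upgrade "$RELEASE" hashicorp/vault \
    --namespace "$NAMESPACE" \
    --version "$CHART_VERSION" \
    --reuse-values \
    --set server.image.tag="$VAULT_VERSION"
}
```

Calling `pin_vault_version` runs the upgrade; the same setting can instead live in a values file under `server.image.tag`.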

There are also some similar reports on https://github.com/hashicorp/vault/issues/17527

jawnsy commented 1 year ago

I ran into the same problem when upgrading Vault using the Helm chart, from version 0.21.0 to 0.23.0 of the chart. The solution is to provide the Cloud KMS Viewer role (in addition to Cloud KMS CryptoKey Encrypter/Decrypter) to the service account.
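One way to grant that role, sketched here with an IAM binding scoped to the single unseal key, is shown below. The project, key ring, and key names reuse the `abc`/`def`/`ghi` placeholders from this comment; the service account email is hypothetical. Binding at the project level would also work, but is broader than needed.

```shell
# Placeholder values (assumptions): substitute your own.
PROJECT="abc"
LOCATION="global"
KEYRING="def"
KEY="ghi"
SA_EMAIL="vault-sa@${PROJECT}.iam.gserviceaccount.com"  # hypothetical account

# Grant Cloud KMS Viewer (roles/cloudkms.viewer) on the unseal key only.
grant_kms_viewer() {
  gcloud kms keys add-iam-policy-binding "$KEY" \
    --keyring="$KEYRING" \
    --location="$LOCATION" \
    --project="$PROJECT" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="roles/cloudkms.viewer"
}
```

Run `grant_kms_viewer` as a user allowed to change IAM policy on the key, then restart the failing Vault pod.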

I suspect that a check for the key's existence was added somewhere, whereas Vault previously just tried to use the key.

You can replicate Vault's behavior by creating a pod using the same service account using the google/cloud-sdk image, and then running:

# gcloud kms keys describe ghi --keyring=def --location=global --project=abc
ERROR: (gcloud.kms.keys.describe) PERMISSION_DENIED: Permission 'cloudkms.cryptoKeys.get' denied on resource 'projects/abc/locations/global/keyRings/def/cryptoKeys/ghi' (or it may not exist).

Once you grant the KMS viewer role, you should see output like this:

# gcloud kms keys describe ghi --keyring=def --location=global --project=abc
createTime: '2019-12-10T20:42:09.838170389Z'
destroyScheduledDuration: 86400s
name: projects/abc/locations/global/keyRings/def/cryptoKeys/ghi
nextRotationTime: '2022-12-04T04:13:44.120438Z'
primary:
  algorithm: GOOGLE_SYMMETRIC_ENCRYPTION
  createTime: '2022-12-03T00:27:04.120438038Z'
  generateTime: '2022-12-03T00:27:04.120438038Z'
  name: projects/abc/locations/global/keyRings/def/cryptoKeys/ghi/cryptoKeyVersions/3
  protectionLevel: SOFTWARE
  state: ENABLED
purpose: ENCRYPT_DECRYPT
rotationPeriod: 100000s
versionTemplate:
  algorithm: GOOGLE_SYMMETRIC_ENCRYPTION
  protectionLevel: SOFTWARE

dhess commented 1 year ago

I can confirm that we've also run into this issue on GKE when upgrading the chart from 0.22.0 to 0.23.0, and overriding the Vault version to 1.12.2.

stevendpclark commented 1 year ago

Hello all,

@jawnsy has the correct answer to the issue at hand. During the 1.12 development cycle we changed the ordering of some calls, which moved a cloudkms.cryptoKeys.get request up into the initial setup phase for GCP KMS keys. We now validate that the key exists with a get request, because keys can be used for either encryption or signing (through managed keys). The existing encryption-only test was therefore insufficient and would fail for keys designated as signing-only.
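A rough preflight that approximates this order with `gcloud` (it is not Vault's actual code) is to read the key's metadata first and then attempt a test encryption. It should run under the same service account as Vault. The resource names are the placeholders used earlier in the thread.

```shell
# Placeholder values (assumptions): substitute your own.
PROJECT="abc"
LOCATION="global"
KEYRING="def"
KEY="ghi"

check_unseal_key_access() {
  # Step 1: metadata read; requires cloudkms.cryptoKeys.get.
  if ! gcloud kms keys describe "$KEY" \
      --keyring="$KEYRING" --location="$LOCATION" --project="$PROJECT" \
      >/dev/null; then
    echo "describe failed: check cloudkms.cryptoKeys.get (Cloud KMS Viewer)" >&2
    return 1
  fi
  # Step 2: test encryption of a throwaway value read from stdin.
  if ! printf 'probe' | gcloud kms encrypt \
      --key="$KEY" --keyring="$KEYRING" --location="$LOCATION" \
      --project="$PROJECT" \
      --plaintext-file=- --ciphertext-file=- >/dev/null; then
    echo "encrypt failed: check Cloud KMS CryptoKey Encrypter/Decrypter" >&2
    return 1
  fi
  echo "key is readable and usable for encryption"
}
```

If step 1 fails while step 2 would have succeeded, the service account is in the state described in this issue: it can use the key but cannot read it.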