Azure / terraform-azurerm-aks

Terraform Module for deploying an AKS cluster
MIT License

KMS using KeyVault fails with identity encrypt/decrypt permission and public network access #390

Closed jckeme-rs closed 8 months ago

jckeme-rs commented 1 year ago

Is there an existing issue for this?

Greenfield/Brownfield provisioning

greenfield

Terraform Version

1.4.6

Module Version

3.57.0

AzureRM Provider Version

3.57.0

Affected Resource(s)/Data Source(s)

azurerm_kubernetes_cluster

Terraform Configuration Files

resource "azurerm_key_vault_key" "aks_kms_key" {
  name         = join("-", [var.aks_cluster_name, "kms-key"])
  key_vault_id = var.aks_keyvault_id

  key_type     = "RSA"
  key_size     = 4096

  key_opts = [
    "decrypt",
    "encrypt",
    "unwrapKey",
    "wrapKey",
    "sign",
    "verify"
  ]

  tags = merge(var.common_tags, var.custom_tags)

}

resource "azurerm_kubernetes_cluster" "aks_cluster" {
  name                          = var.aks_cluster_name
  location                      = var.az_region
  resource_group_name           = var.rsg_name
  kubernetes_version            = var.kubernetes_version
  dns_prefix_private_cluster    = var.dns_prefix_private_cluster
...

  key_management_service {
    key_vault_key_id         = azurerm_key_vault_key.aks_kms_key.id
    key_vault_network_access = "Public"
  }
}

tfvars variables values

tfvars

Debug Output/Panic Output

-

Expected Behaviour

KMS etcd encryption should have been enabled on the AKS cluster using the Key Vault key.

Actual Behaviour

The following error was received: Code="AzureKeyVaultKmsValidateIdentityPermissionCustomerError" Message="The identity does not have keys encrypt/decrypt permission on key vault

Steps to Reproduce

No response

Important Factoids

No response

References

Issue was described in more detail here: https://github.com/MicrosoftDocs/azure-docs/issues/98954

Looking at the KMS example in this repo, it appears that the role granted to the identity should be "Key Vault Contributor". However, the documentation says "Key Vault Crypto User".

https://github.com/Azure/terraform-azurerm-aks/blob/main/examples/named_cluster/kms.tf

zioproto commented 1 year ago

@jckeme-rs I understand you are trying to add Key Management Service (KMS) etcd encryption to an Azure Kubernetes Service (AKS) cluster.

I understand you are not using our AKS Terraform module, but you are looking at the examples folder to debug an issue when you are using the azurerm_kubernetes_cluster resource directly.

KMS etcd encryption doesn't work with a system-assigned managed identity. Can you please confirm that you are using a user-assigned managed identity? Your minimal example to reproduce the problem does not show the creation of a user-assigned managed identity.

Thank you
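For readers following along, here is a minimal sketch of what attaching a user-assigned managed identity to the cluster could look like with the azurerm 3.x provider. The variables var.aks_cluster_name, var.az_region and var.rsg_name come from the reported configuration; the identity name, dns_prefix and node pool values are placeholders assumed for illustration, not taken from the reporter's setup.

```hcl
variable "aks_cluster_name" { type = string }
variable "az_region" { type = string }
variable "rsg_name" { type = string }

# Hypothetical user-assigned identity for the AKS control plane.
resource "azurerm_user_assigned_identity" "aks_control_plane" {
  name                = "${var.aks_cluster_name}-control-plane"
  location            = var.az_region
  resource_group_name = var.rsg_name
}

resource "azurerm_kubernetes_cluster" "aks_cluster" {
  name                = var.aks_cluster_name
  location            = var.az_region
  resource_group_name = var.rsg_name
  dns_prefix          = var.aks_cluster_name # placeholder value

  default_node_pool {
    name       = "default"
    vm_size    = "Standard_D2s_v3" # placeholder value
    node_count = 1
  }

  # Use the user-assigned identity instead of a system-assigned one.
  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.aks_control_plane.id]
  }
}
```

The key_management_service block from the reported configuration would sit alongside the identity block in the same resource.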

jckeme-rs commented 1 year ago

Hi @zioproto, apologies for the delayed feedback. Yes, the AKS cluster uses two user-assigned managed identities: one for the kubelets and one for the AKS control plane.

This GitHub issue describes precisely the error that was experienced: https://github.com/MicrosoftDocs/azure-docs/issues/98954

For the KMS feature with an RBAC-enabled Key Vault, the role assigned was "Key Vault Crypto User", and the same error was received:

Code="AzureKeyVaultKmsValidateIdentityPermissionCustomerError" Message="The identity does not have keys encrypt/decrypt permission on key vault

Is it possible to confirm which role must be assigned to the AKS control plane user-assigned managed identity on the private Azure Key Vault so that KMS works via private endpoints?
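To make the setup described in this comment concrete, the role assignment could be sketched as follows. This only illustrates the "Key Vault Crypto User" assignment mentioned above; it is not a confirmed fix, since the thread reports that the error persisted with this role. var.aks_keyvault_id comes from the reported configuration, while var.control_plane_principal_id is a hypothetical variable standing in for the principal ID of the control plane identity.

```hcl
variable "aks_keyvault_id" { type = string }

# Hypothetical: principal ID of the AKS control plane user-assigned identity.
variable "control_plane_principal_id" { type = string }

# Grant the control plane identity the role named in the documentation,
# scoped to the Key Vault that holds the KMS key.
resource "azurerm_role_assignment" "aks_kms_crypto_user" {
  scope                = var.aks_keyvault_id
  role_definition_name = "Key Vault Crypto User"
  principal_id         = var.control_plane_principal_id
}
```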

bhperry commented 9 months ago

Running into this problem as well. A workaround is to explicitly set public_network_access_enabled = true on the vault in Terraform, even though that is the default value.

Explanation here: https://github.com/MicrosoftDocs/azure-docs/issues/98954#issuecomment-1933135303
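A sketch of that workaround, assuming the vault is managed in the same Terraform configuration with the azurerm 3.x provider. The variable var.keyvault_name and the resource label "kms" are hypothetical; var.az_region and var.rsg_name follow the naming in the original report.

```hcl
variable "keyvault_name" { type = string } # hypothetical
variable "az_region" { type = string }
variable "rsg_name" { type = string }

data "azurerm_client_config" "current" {}

resource "azurerm_key_vault" "kms" {
  name                      = var.keyvault_name
  location                  = var.az_region
  resource_group_name       = var.rsg_name
  tenant_id                 = data.azurerm_client_config.current.tenant_id
  sku_name                  = "standard"
  enable_rbac_authorization = true

  # The workaround: state the default explicitly rather than omitting it.
  public_network_access_enabled = true
}
```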

lonegunmanb commented 8 months ago

Special thanks to @bhperry. This issue doesn't seem to be a bug in this module, so I'm closing it. Please feel free to reopen it if you have a different idea, @jckeme-rs.

@bhperry Have you tried KeyVault with public_network_access_enabled = false and a private endpoint to the KeyVault? Just curious whether we can fix the issue by creating a private endpoint in vnet.

bhperry commented 8 months ago

I have not gotten around to testing private access yet, but it should work. As far as I can tell, this is a bug on the Azure backend when it has to assign a default value for public access.

oc159 commented 1 month ago

Tested with a private endpoint and it didn't work, unfortunately. Setting public_network_access_enabled = false on the Key Vault in the Terraform deployment still caused the issue. This was with Azure RBAC, a user-assigned managed identity, and the RBAC role Key Vault Crypto Officer assigned.
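For anyone trying to reproduce this, a private endpoint to the vault could be sketched roughly as below. This is an assumed reconstruction of the kind of setup described, not the commenter's actual code, and the thread reports that it did not resolve the error. var.private_endpoint_subnet_id and the resource names are hypothetical; private DNS zone configuration is not shown.

```hcl
variable "aks_keyvault_id" { type = string }
variable "az_region" { type = string }
variable "rsg_name" { type = string }

# Hypothetical: subnet in the cluster's VNet that hosts the private endpoint.
variable "private_endpoint_subnet_id" { type = string }

resource "azurerm_private_endpoint" "kms_vault" {
  name                = "kms-vault-pe"
  location            = var.az_region
  resource_group_name = var.rsg_name
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "kms-vault-connection"
    private_connection_resource_id = var.aks_keyvault_id
    subresource_names              = ["vault"]
    is_manual_connection           = false
  }
}
```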