hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0
4.61k stars 4.65k forks source link

[ISSUE] Issue with azurerm_databricks_workspace resource and CMK for encryption cross subscriptions #24883

Closed yessawab closed 7 months ago

yessawab commented 9 months ago

Is there an existing issue for this?

Community Note

Terraform Version

1.7.3

AzureRM Provider Version

3.91.0

Affected Resource(s)/Data Source(s)

azurerm_databricks_workspace

Terraform Configuration Files

provider "azurerm" {
  features {}
  skip_provider_registration = true
  subscription_id            = ""
}

provider "azurerm" {
  features {}
  alias           = "management"
  subscription_id = ""
}

data "azurerm_key_vault" "existing" {
  provider            = azurerm.management
  name                = "mngt-kvault1"
  resource_group_name = "mngt"
}

data "azurerm_key_vault_key" "example" {
  provider            = azurerm.management
  name         = "databricks-cmk"
  key_vault_id = data.azurerm_key_vault.existing.id
}

resource "azurerm_databricks_workspace" "this" {
  name                                                = var.databricks_workspace_name
  resource_group_name                                 = var.spoke_resource_group_name
  managed_resource_group_name                         = local.managed_resource_group_name
  location                                            = var.location
  customer_managed_key_enabled                        = "true"
  managed_services_cmk_key_vault_key_id               = data.azurerm_key_vault_key.example.id
  managed_disk_cmk_key_vault_key_id                   = data.azurerm_key_vault_key.example.id
  managed_disk_cmk_rotation_to_latest_version_enabled = "true"
  sku                                                 = "premium"

  custom_parameters {
    no_public_ip                                         = true
    virtual_network_id                                   = azurerm_virtual_network.this.id
    private_subnet_name                                  = azurerm_subnet.private.name
    public_subnet_name                                   = azurerm_subnet.public.name
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
  }
}

Debug Output/Panic Output

2024-02-14T17:13:13.062+0800 [ERROR] vertex "module.adb-lakehouse.azurerm_databricks_workspace.this" error: retrieving the Resource ID for the customer-managed keys for managed services Key Vault at URL "https://mngt-kvault1.vault.azure.net/": <nil>
╷
│ Error: retrieving the Resource ID for the customer-managed keys for managed services Key Vault at URL "https://mngt-kvault1.vault.azure.net/": <nil>
│ 
│   with module.adb-lakehouse.azurerm_databricks_workspace.this,
│   on ../../modules/adb-lakehouse/vnet_injected_databricks_workspace.tf line 67, in resource "azurerm_databricks_workspace" "this":
│   67: resource "azurerm_databricks_workspace" "this" {
│

Expected Behaviour

When deploying a new Azure Databricks workspace and using encryption with a Customer Managed Key (CMK) sourced from a Key Vault in a distinct Azure subscription, the workspace must be instantiated utilizing said key for encryption. Same behaviour should apply when configuring CMK to an existing Azure Databricks workspace.

Actual Behaviour

During the execution of "terraform apply," the AzureRM provider fails in accessing the key from the other subscription. We experimented with both passing the key ID as a variable and utilising a Terraform Data Source, but neither method proved successful. Similar issue

Steps to Reproduce

Using the adb-lakehouse Terraform example and adding the CMK arguments to azurerm_databricks_workspace resource.

Important Factoids

No response

References

Similar issue

favoretti commented 9 months ago

Hi there and thank you for reporting this issue. As discussed offline - in this case you need to pass properly configured provider reference to both azurerm_key_vault and azurerm_key_vault_key datasources. That would potentially fix the issue at hand.

yessawab commented 9 months ago

Hi @favoretti, that's exactly how I configure my Terraform code. I add provider = azurerm.management to both datasources but still facing the same issue.

MaguellSandifort commented 9 months ago

We ran into the same issue. Looks like the databricks_workspace does some form of look-up on the keyvault with the provider configuration of the workspace, which has no access to the keyvault, resulting in an error.

For now we worked around it by deploying the workspace without encryption. Then apply the CMK configuration with an azapi_update_resource block:

# Configured for the subscription the workspace will be deployed to
provider "azurerm" {
  features {}
  skip_provider_registration = true
  subscription_id            = ""
}

# Configured for the subscription the key vault is located in
provider "azurerm" {
  features {}
  alias           = "management"
  subscription_id = ""
}

data "azurerm_key_vault" "existing" {
  provider            = azurerm.management
  name                = var.vault-name
  resource_group_name = var.vault-rg-name
}

data "azurerm_key_vault_key" "example" {
  provider     = azurerm.management
  name         = var.cmk-name
  key_vault_id = data.azurerm_key_vault.existing.id
}

resource "azurerm_role_assignment" "kv_crypto_user_to_databricks_enterprise_application" {
  provider             = azurerm.management
  scope                = data.azurerm_key_vault.existing.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = local.databricks_enterprise_application_id
}

resource "azurerm_databricks_workspace" "this" {
  name                                                = var.databricks_workspace_name
  resource_group_name                                 = var.spoke_resource_group_name
  managed_resource_group_name                         = local.managed_resource_group_name
  location                                            = var.location
  sku                                                 = "premium"

  # This enables the managed identity on the managed storage account needed for DBFS encryption, so we leave this on
  customer_managed_key_enabled                        = true

  # Not setting these in the workspace resource:
  # managed_services_cmk_key_vault_key_id               = data.azurerm_key_vault_key.example.id
  # managed_disk_cmk_key_vault_key_id                   = data.azurerm_key_vault_key.example.id
  # managed_disk_cmk_rotation_to_latest_version_enabled = true

  custom_parameters {
    no_public_ip                                         = true
    virtual_network_id                                   = azurerm_virtual_network.this.id
    private_subnet_name                                  = azurerm_subnet.private.name
    public_subnet_name                                   = azurerm_subnet.public.name
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
  }
}

# This needs to be done after the workspace deployment, but before applying the CMK to the DBFS root
resource "azurerm_role_assignment" "kv_crypto_user_to_dbfs_storage" {
  provider             = azurerm.management
  scope                = data.azurerm_key_vault.existing.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = azurerm_databricks_workspace.this.storage_account_identity[0].principal_id
  depends_on           = [ azurerm_databricks_workspace.this ]
}

# Apply CMK configuration for DBFS root, Managed Services & Managed DIsk
resource "azapi_update_resource" "dbw_cmk" {
  resource_id = azurerm_databricks_workspace.this.id
  type        = "Microsoft.Databricks/workspaces@2023-02-01"

  body = jsonencode({
    properties = {
      parameters = {
        encryption = {
          value = {
            keySource   = "Microsoft.Keyvault"
            KeyName     = var.cmk-name
            keyvaulturi = data.azurerm_key_vault.vault_uri
            keyversion  = data.azurerm_key_vault_key.version
          }
        }
      }
      encryption = {
        entities = {
          managedServices = {
            keySource = "Microsoft.Keyvault"
            keyVaultProperties = {
              keyName     = var.cmk-name
              keyVaultUri = data.azurerm_key_vault.vault_uri
              keyVersion  = data.azurerm_key_vault_key.version
            }
          }
          managedDisk = {
            keySource = "Microsoft.Keyvault",
            keyVaultProperties = {
              keyName     = var.cmk-name
              keyVaultUri = data.azurerm_key_vault.vault_uri
              keyVersion  = data.azurerm_key_vault_key.version
            }
            rotationToLatestKeyVersionEnabled = true
          }
        }
      }
    }
  })

  depends_on = [ azurerm_role_assignment.kv_crypto_user_to_dbfs_storage ]
}

# Read the workspace that was deployed earlier as a data source after the update is applied
data "azurerm_databricks_workspace" "this" {
  name                = var.databricks_workspace_name
  resource_group_name = var.spoke_resource_group_name
  depends_on          = [ azapi_update_resource.dbw_cmk ]
}

# This needs to be done after the workspace update, as the disk encryption set is not available before the CMK configuration is applied
resource "azurerm_role_assignment" "kv_crypto_user_to_disk_encryption_set" {
  provider             = azurerm.management
  scope                = data.azurerm_key_vault.existing.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = data.azurerm_databricks_workspace.this.managed_disk_identity[0].principal_id
}
favoretti commented 9 months ago

Since only the key.id is being passed to the workspace, code tries to make sure that keyVault exists and that's where it fails if the KV is in a different subscription.

I could potentially disable that check and just trust that it exists to see if this fixes the issue - would anyone be able to make a test run with a development provider version? If so - please reach out - I'll compile a version for your architecture to give it a try.

Thanks!

MaguellSandifort commented 8 months ago

Since only the key.id is being passed to the workspace, code tries to make sure that keyVault exists and that's where it fails if the KV is in a different subscription.

I could potentially disable that check and just trust that it exists to see if this fixes the issue - would anyone be able to make a test run with a development provider version? If so - please reach out - I'll compile a version for your architecture to give it a try.

Thanks!

Sounds promising! I can be your guinea pig and give it a go. Any specific information that you need?

favoretti commented 8 months ago

Since only the key.id is being passed to the workspace, code tries to make sure that keyVault exists and that's where it fails if the KV is in a different subscription.

I could potentially disable that check and just trust that it exists to see if this fixes the issue - would anyone be able to make a test run with a development provider version? If so - please reach out - I'll compile a version for your architecture to give it a try.

Thanks!

Sounds promising! I can be your guinea pig and give it a go. Any specific information that you need?

Nope, nothing specific needed. Drop me a message on provider slack? I'll be home in 2 hours and can build you a test version.

Thanks!

WodansSon commented 8 months ago

I have opened a PR which should fix this, you cannot use data sources without an aliased provider block because the data source resource code automatically assumes that the subscription is the current subscription that Terraform is executed from:

func dataSourceKeyVaultRead(d *pluginsdk.ResourceData, meta interface{}) error {
    client := meta.(*clients.Client).KeyVault.VaultsClient
    subscriptionId := meta.(*clients.Client).Account.SubscriptionId
    ctx, cancel := timeouts.ForRead(meta.(*clients.Client).StopContext, d)
    defer cancel()
github-actions[bot] commented 6 months ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.