Closed · yessawab closed this issue 7 months ago
Hi there and thank you for reporting this issue. As discussed offline, in this case you need to pass a properly configured provider reference to both the azurerm_key_vault
and azurerm_key_vault_key
data sources. That would potentially fix the issue at hand.
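For reference, a minimal sketch of that pattern (the alias and variable names here are placeholders, not taken from anyone's actual configuration):

```hcl
# Aliased provider pointing at the subscription that holds the Key Vault
provider "azurerm" {
  features {}
  alias           = "management"
  subscription_id = var.management_subscription_id
}

# Both data sources must use the aliased provider, or the lookup
# happens in the default provider's subscription and fails
data "azurerm_key_vault" "cmk" {
  provider            = azurerm.management
  name                = var.vault_name
  resource_group_name = var.vault_rg_name
}

data "azurerm_key_vault_key" "cmk" {
  provider     = azurerm.management
  name         = var.cmk_name
  key_vault_id = data.azurerm_key_vault.cmk.id
}
```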
Hi @favoretti, that's exactly how I configure my Terraform code. I add provider = azurerm.management
to both data sources, but I'm still facing the same issue.
We ran into the same issue. It looks like azurerm_databricks_workspace
does some form of look-up on the Key Vault using the workspace's provider configuration, which has no access to the Key Vault, resulting in an error.
For now we worked around it by deploying the workspace without encryption, then applying the CMK configuration with an azapi_update_resource
block:
# Configured for the subscription the workspace will be deployed to
provider "azurerm" {
  features {}
  skip_provider_registration = true
  subscription_id            = ""
}

# Configured for the subscription the key vault is located in
provider "azurerm" {
  features {}
  alias           = "management"
  subscription_id = ""
}

data "azurerm_key_vault" "existing" {
  provider            = azurerm.management
  name                = var.vault-name
  resource_group_name = var.vault-rg-name
}

data "azurerm_key_vault_key" "example" {
  provider     = azurerm.management
  name         = var.cmk-name
  key_vault_id = data.azurerm_key_vault.existing.id
}
resource "azurerm_role_assignment" "kv_crypto_user_to_databricks_enterprise_application" {
  provider             = azurerm.management
  scope                = data.azurerm_key_vault.existing.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = local.databricks_enterprise_application_id
}
resource "azurerm_databricks_workspace" "this" {
  name                        = var.databricks_workspace_name
  resource_group_name         = var.spoke_resource_group_name
  managed_resource_group_name = local.managed_resource_group_name
  location                    = var.location
  sku                         = "premium"

  # This enables the managed identity on the managed storage account needed for DBFS encryption, so we leave this on
  customer_managed_key_enabled = true

  # Not setting these in the workspace resource:
  # managed_services_cmk_key_vault_key_id               = data.azurerm_key_vault_key.example.id
  # managed_disk_cmk_key_vault_key_id                   = data.azurerm_key_vault_key.example.id
  # managed_disk_cmk_rotation_to_latest_version_enabled = true

  custom_parameters {
    no_public_ip                                         = true
    virtual_network_id                                   = azurerm_virtual_network.this.id
    private_subnet_name                                  = azurerm_subnet.private.name
    public_subnet_name                                   = azurerm_subnet.public.name
    private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
    public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
  }
}
# This needs to be done after the workspace deployment, but before applying the CMK to the DBFS root
resource "azurerm_role_assignment" "kv_crypto_user_to_dbfs_storage" {
  provider             = azurerm.management
  scope                = data.azurerm_key_vault.existing.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = azurerm_databricks_workspace.this.storage_account_identity[0].principal_id

  depends_on = [azurerm_databricks_workspace.this]
}
# Apply CMK configuration for DBFS root, Managed Services & Managed Disk
resource "azapi_update_resource" "dbw_cmk" {
  resource_id = azurerm_databricks_workspace.this.id
  type        = "Microsoft.Databricks/workspaces@2023-02-01"
  body = jsonencode({
    properties = {
      parameters = {
        encryption = {
          value = {
            keySource   = "Microsoft.Keyvault"
            KeyName     = var.cmk-name
            keyvaulturi = data.azurerm_key_vault.existing.vault_uri
            keyversion  = data.azurerm_key_vault_key.example.version
          }
        }
      }
      encryption = {
        entities = {
          managedServices = {
            keySource = "Microsoft.Keyvault"
            keyVaultProperties = {
              keyName     = var.cmk-name
              keyVaultUri = data.azurerm_key_vault.existing.vault_uri
              keyVersion  = data.azurerm_key_vault_key.example.version
            }
          }
          managedDisk = {
            keySource = "Microsoft.Keyvault"
            keyVaultProperties = {
              keyName     = var.cmk-name
              keyVaultUri = data.azurerm_key_vault.existing.vault_uri
              keyVersion  = data.azurerm_key_vault_key.example.version
            }
            rotationToLatestKeyVersionEnabled = true
          }
        }
      }
    }
  })
  depends_on = [azurerm_role_assignment.kv_crypto_user_to_dbfs_storage]
}
# Read the workspace that was deployed earlier as a data source after the update is applied
data "azurerm_databricks_workspace" "this" {
  name                = var.databricks_workspace_name
  resource_group_name = var.spoke_resource_group_name

  depends_on = [azapi_update_resource.dbw_cmk]
}

# This needs to be done after the workspace update, as the disk encryption set is not available before the CMK configuration is applied
resource "azurerm_role_assignment" "kv_crypto_user_to_disk_encryption_set" {
  provider             = azurerm.management
  scope                = data.azurerm_key_vault.existing.id
  role_definition_name = "Key Vault Crypto Service Encryption User"
  principal_id         = data.azurerm_databricks_workspace.this.managed_disk_identity[0].principal_id
}
Since only the key.id
is being passed to the workspace, the code tries to make sure that the Key Vault exists, and that's where it fails if the KV is in a different subscription: a key ID is just the vault URI plus key name and version, so it carries no subscription information.
I could potentially disable that check and just trust that it exists to see if this fixes the issue. Would anyone be able to make a test run with a development provider version? If so, please reach out and I'll compile a version for your architecture to give it a try.
Thanks!
Sounds promising! I can be your guinea pig and give it a go. Any specific information that you need?
Nope, nothing specific needed. Drop me a message on provider slack? I'll be home in 2 hours and can build you a test version.
Thanks!
I have opened a PR which should fix this. You cannot use these data sources without an aliased provider block, because the data source code automatically assumes the subscription is the current subscription that Terraform is executed from:
func dataSourceKeyVaultRead(d *pluginsdk.ResourceData, meta interface{}) error {
	client := meta.(*clients.Client).KeyVault.VaultsClient
	subscriptionId := meta.(*clients.Client).Account.SubscriptionId

	ctx, cancel := timeouts.ForRead(meta.(*clients.Client).StopContext, d)
	defer cancel()
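Concretely, that means the presence or absence of the `provider` meta-argument decides which subscription the read happens in. A minimal sketch (vault and resource group names are illustrative):

```hcl
# Fails cross-subscription: the read uses the default provider's subscription
data "azurerm_key_vault" "broken" {
  name                = "cross-sub-vault"
  resource_group_name = "vault-rg"
}

# Works: the aliased provider carries the vault subscription's credentials
data "azurerm_key_vault" "working" {
  provider            = azurerm.management
  name                = "cross-sub-vault"
  resource_group_name = "vault-rg"
}
```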
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Is there an existing issue for this?
Community Note
Terraform Version
1.7.3
AzureRM Provider Version
3.91.0
Affected Resource(s)/Data Source(s)
azurerm_databricks_workspace
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
When deploying a new Azure Databricks workspace with encryption using a Customer Managed Key (CMK) sourced from a Key Vault in a different Azure subscription, the workspace should be created using that key for encryption. The same behaviour should apply when adding a CMK configuration to an existing Azure Databricks workspace.
Actual Behaviour
During terraform apply, the AzureRM provider fails to access the key from the other subscription. We experimented with both passing the key ID as a variable and using a Terraform data source, but neither method succeeded. Similar issue
Steps to Reproduce
Using the adb-lakehouse Terraform example and adding the CMK arguments to the azurerm_databricks_workspace resource.
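A minimal sketch of the arguments in question (names and values are illustrative; the key data source would reference a Key Vault in the other subscription):

```hcl
resource "azurerm_databricks_workspace" "example" {
  name                        = "example-dbw"
  resource_group_name         = "example-rg"
  managed_resource_group_name = "example-dbw-managed-rg"
  location                    = "westeurope"
  sku                         = "premium"

  # CMK arguments that trigger the cross-subscription lookup failure
  customer_managed_key_enabled                        = true
  managed_services_cmk_key_vault_key_id               = data.azurerm_key_vault_key.example.id
  managed_disk_cmk_key_vault_key_id                   = data.azurerm_key_vault_key.example.id
  managed_disk_cmk_rotation_to_latest_version_enabled = true
}
```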
Important Factoids
No response
References
Similar issue