Hi Team,
I have removed the workspace and retried again. A pattern I have noticed is that when I set `managed_disk_cmk_rotation_to_latest_version_enabled = false`, the CMK creation works fine on an existing Databricks workspace. But when this is set to `true`, I get the below error: `polling after create or update: internal-error: a polling status of failed should be surfaced as a polling failed error`
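For reference, the flag in question, shown in a minimal sketch with hypothetical names for the other arguments:

```hcl
resource "azurerm_databricks_workspace" "this" {
  name                = "example-workspace" # hypothetical
  resource_group_name = "example-rg"        # hypothetical
  location            = "eastus"
  sku                 = "premium"

  managed_disk_cmk_key_vault_key_id = azurerm_key_vault_key.example_disk_key.id # hypothetical key

  # apply succeeds when this is false; fails with the polling error above when true
  managed_disk_cmk_rotation_to_latest_version_enabled = true
}
```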
@aparna-reji, after reading the CLI issue it appears that the issue is with the RP and not the provider. I will reach out to the service team and see if I can get any more information about the issue which caused this error. 🚀
@aparna-reji: Can we have the ARM resource ID of the Databricks workspace? It will be in this format: `/subscriptions/XXXXX/resourceGroups/XXXXX/providers/Microsoft.Databricks/workspaces/XXXXXX`
@aparna-reji:
In trying to track down and debug this error, I found where the error is being returned from (see the code below). It appears this is being caused by the update call that propagates the tags to all of the connected resources. In your example above, did you by any chance happen to update the tags too prior to running the `apply`?
```go
// Only call Update (e.g., PATCH) if it is not a new resource and the Tags have changed
// this will cause the updated tags to be propagated to all of the connected
// workspace resources.
// TODO: can be removed once https://github.com/Azure/azure-sdk-for-go/issues/14571 is fixed
if !d.IsNewResource() && d.HasChange("tags") {
	workspaceUpdate := workspaces.WorkspaceUpdate{
		Tags: expandedTags,
	}

	err := client.UpdateThenPoll(ctx, id, workspaceUpdate)
	if err != nil {
		return fmt.Errorf("updating %s Tags: %+v", id, err)
	}
}
```
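To make the trigger concrete, any edit to the `tags` map on the workspace resource is enough to route through that PATCH path on the next apply; a sketch with hypothetical tag values:

```hcl
resource "azurerm_databricks_workspace" "example" {
  # ... other arguments unchanged ...

  tags = {
    tag1 = "val1"
    tag2 = "val2-edited" # changing any tag makes d.HasChange("tags") true,
                         # so the provider issues the PATCH shown above
  }
}
```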
@msaranga
I think this is related to issue 14571. I checked the issue, and it is still open, so I am assuming we still need to implement the work-around as described in the above Terraform provider code.
@WodansSon Did you mean whether the tags were updated manually before the apply? I don't think that has happened. But in my terraform plan for the run where I set `managed_disk_cmk_rotation_to_latest_version_enabled = false`, the tags are shown as updated in place and the actual tags are shown as being removed. Please see the relevant part of the terraform plan below:
```
~ tags = {
    - "tag1" = "val1" -> null
    - "tag2" = "val2" -> null
    - "tag3" = "val3" -> null
  }
```
And when I go and check the redeployed Databricks workspace, I can see that it still has the tags.
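Side note: if these tags are managed outside of Terraform, I assume an `ignore_changes` lifecycle rule would keep Terraform from planning their removal; a minimal sketch:

```hcl
resource "azurerm_databricks_workspace" "this" {
  # ... other arguments unchanged ...

  lifecycle {
    # the tags are applied outside of this configuration, so don't plan their removal
    ignore_changes = [tags]
  }
}
```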
Also, I believe this update call issue is what is happening for the error `performing CreateOrUpdate: unexpected status 400 with error: DiskEncryptionPropertiesRequired: Existing Disk Encryption Properties must be specified in the PUT request.` But as I mentioned, when I removed the workspace and retried again with `managed_disk_cmk_rotation_to_latest_version_enabled = false`, the CMK creation worked fine on an existing Databricks workspace. But when this is set to `true`, I get the below error: `polling after create or update: internal-error: a polling status of failed should be surfaced as a polling failed error`. May I know if this polling status failed error is also due to the update call?
@aparna-reji, Yes, the polling error is coming back from the update call. In the provider we make the update call and then poll the LRO until it is complete. From the above it appears an error happened while we were polling for completion of the update call.
@aparna-reji, I am not able to reproduce your issue locally except for the `performing CreateOrUpdate: unexpected status 400 with error: DiskEncryptionPropertiesRequired: Existing Disk Encryption Properties must be specified in the PUT request` error, which I believe is by design, because once you encrypt the workspace you cannot undo it without destroying the workspace and recreating it without the encryption.
Just for verification, here are the steps I took based on how I understood your repro case above. I created the workspace and all of the supporting resources like this, so this is the state I believe your workspace was in before you added the DBFS and Managed Disk CMK encryption keys:
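A minimal sketch of that initial state (the resource and workspace names are taken from the plan output further down in this comment; everything else here is an assumption):

```hcl
resource "azurerm_databricks_workspace" "repro" {
  name                = "databricks-repro"
  resource_group_name = azurerm_resource_group.repro.name
  location            = azurerm_resource_group.repro.location
  sku                 = "premium"

  # must already be enabled so the DBFS CMK resource can be attached later
  customer_managed_key_enabled = true
}
```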
Once that is provisioned, the encryption blade in your workspace should look like the below:
I then added the DBFS and the Managed Disk encryption configuration like this:
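A sketch of that change, assuming the key vault key names (the workspace attribute names match the plan output later in this thread):

```hcl
resource "azurerm_databricks_workspace" "repro" {
  # ... arguments as in the initial configuration ...

  managed_disk_cmk_key_vault_key_id                   = azurerm_key_vault_key.databricks_encrypt_disk_key.id
  managed_disk_cmk_rotation_to_latest_version_enabled = true
}

resource "azurerm_databricks_workspace_customer_managed_key" "databricks_DBFS" {
  workspace_id     = azurerm_databricks_workspace.repro.id
  key_vault_key_id = azurerm_key_vault_key.databricks_encrypt_dbfs_key.id
}
```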
I then did an `apply` and allowed it to deploy the new infrastructure. Once that completes successfully, your encryption blade in your workspace should look like this:
So if I read your repro steps correctly, we are now in the same state your environment was in before you attempted to revert the CMK keys for Managed Disk and DBFS, is that correct?
I then removed the Managed Disk CMK configuration values from the `azurerm_databricks_workspace` resource, as sketched below:
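The removal, with the same assumed key names as above:

```hcl
resource "azurerm_databricks_workspace" "repro" {
  # ... arguments as in the initial configuration ...

  # removed:
  # managed_disk_cmk_key_vault_key_id                   = azurerm_key_vault_key.databricks_encrypt_disk_key.id
  # managed_disk_cmk_rotation_to_latest_version_enabled = true
}
```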
Which resulted in the error you mentioned in your repro steps above:
I believe this error is by design, but I will check with the service team to see what the expected behavior is supposed to be once you set the managed disk CMK on a workspace and then attempt to remove it after it has already been set.
Now that I had received the `Existing Disk Encryption Properties must be specified in the PUT request.` error, I then attempted to remove the DBFS encryption settings from the workspace as below:
I then ran an `apply`, which successfully removes the DBFS encryption settings, but fails with the same error message as above when it attempts to remove the managed disk CMK settings again. Once the `apply` fails, the encryption blade for my workspace looks like this:
As you can see from my attempted repro steps above, I was not able to reproduce either of the below reported errors:
```
│ Error: updating Workspace (Subscription: 'XXXXXXXXXXXXXXXX'
│ Resource Group Name: 'XXXX-rg'
│ Workspace Name: 'databricks') Tags: performing Update: unexpected status 400 with error: ApplicationUpdateFail: Failed to update application: 'databricks', because patch resource group failure.
```

```
polling after create or update: internal-error: a polling status of failed should be surfaced as a polling failed error
```
> @aparna-reji, Yes, the polling error is coming back from the update call. In the provider we make the update call and then poll the LRO until it is complete. From the above it appears an error happened while we were polling for completion of the update call.

May I please know what an LRO is?
@aparna-reji, Sure, it's short for **L**ong **R**unning **O**peration.
@WodansSon @aparna-reji Currently we don't support disabling CMK for Disk and Managed Services (aka CMK for Notebook). We expect clients to pass the CMK information during every workspace update.
@msaranga / @WodansSon Actually, I don't want to disable CMK for Disk and Managed Services.
@WodansSon Replying to your repro walkthrough above:
The initial Databricks workspace configuration looks correct.
Adding the Managed Disk CMK configuration also looks correct, except for the fact that when I set `managed_disk_cmk_rotation_to_latest_version_enabled = true`, I get an error. That's when I first got:

```
│ Error: updating Workspace (Subscription: 'XXXXXXXXXXXXXXXX'
│ Resource Group Name: 'XXXX-rg'
│ Workspace Name: 'databricks') Tags: performing Update: unexpected status 400 with error: ApplicationUpdateFail: Failed to update application: 'databricks', because patch resource group failure.
```

Later, when I retried by setting `managed_disk_cmk_rotation_to_latest_version_enabled = false`, the terraform apply ran successfully. When I then retried setting `managed_disk_cmk_rotation_to_latest_version_enabled = true` with the managed services, DBFS, and disk CMKs, I got the error: `polling after create or update: internal-error: a polling status of failed should be surfaced as a polling failed error`
And I did not attempt to revert the CMK keys for Managed Disk and DBFS.
The issue is that I cannot successfully add the managed disk CMK to an existing workspace with `managed_disk_cmk_rotation_to_latest_version_enabled = true`. I can only add the managed disk CMK if I set `managed_disk_cmk_rotation_to_latest_version_enabled = false` and keep it like that.
@aparna-reji: Can you please file a support ticket or share the Databricks workspace ID or workspace resource ID with us for troubleshooting?
@msaranga Just to double confirm, will you be contacting Databricks with the workspace ID to get further troubleshooting details?
@aparna-reji: I'm from the Databricks engineering team. Could you provide us the workspace resource ID? It will be in this format:
`/subscriptions/{SubID}/resourceGroups/{RGName}/providers/Microsoft.Databricks/workspaces/{WSName}`
@aparna-reji, thanks for the reply. It appears I am still not understanding your repro case. I just attempted to repro what I believe you are describing and I was not able to get the error that you have reported.
1. I provision a workspace without `managed_disk_cmk_rotation_to_latest_version_enabled` or the `azurerm_databricks_workspace_customer_managed_key` resource defined in the configuration file.
2. I wait for step 1 to finish provisioning and then add `managed_disk_cmk_rotation_to_latest_version_enabled` and the `azurerm_databricks_workspace_customer_managed_key` resource into the configuration file, which generates a `plan` as below:
```
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create
  ~ update in-place

Terraform will perform the following actions:

  # azurerm_databricks_workspace.repro will be updated in-place
  ~ resource "azurerm_databricks_workspace" "repro" {
        id                                                  = "/subscriptions/{subscription}/resourceGroups/repro-22394-resources/providers/Microsoft.Databricks/workspaces/databricks-repro"
      + managed_disk_cmk_key_vault_key_id                   = "https://repro1keyvault.vault.azure.net/keys/repro-disk-certificate/9fcac1dde3ce477caa4ae6c67851bff0"
      + managed_disk_cmk_rotation_to_latest_version_enabled = true
        name                                                = "databricks-repro"
        tags                                                = {}
        # (14 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

  # azurerm_databricks_workspace_root_dbfs_customer_managed_key.databricks_DBFS will be created
  + resource "azurerm_databricks_workspace_root_dbfs_customer_managed_key" "databricks_DBFS" {
      + id               = (known after apply)
      + key_vault_key_id = "https://repro1keyvault.vault.azure.net/keys/repro-dbfs-certificate/7797c6c0b478467992a7df431cd5bbe4"
      + workspace_id     = "/subscriptions/{subscription}/resourceGroups/repro-22394-resources/providers/Microsoft.Databricks/workspaces/databricks-repro"
    }

Plan: 1 to add, 1 to change, 0 to destroy.
```
I then `apply` the configuration changes, which results in:
```
azurerm_databricks_workspace.repro: Still modifying... [id=/subscriptions/{subscription}...Databricks/workspaces/databricks-repro, 30s elapsed]
azurerm_databricks_workspace.repro: Modifications complete after 36s [id=/subscriptions/{subscription}/resourceGroups/repro-22394-resources/providers/Microsoft.Databricks/workspaces/databricks-repro]
azurerm_databricks_workspace_root_dbfs_customer_managed_key.databricks_DBFS: Creating...
azurerm_databricks_workspace_root_dbfs_customer_managed_key.databricks_DBFS: Still creating... [50s elapsed]
azurerm_databricks_workspace_root_dbfs_customer_managed_key.databricks_DBFS: Creation complete after 50s [id=/subscriptions/{subscription}/resourceGroups/repro-22394-resources/providers/Microsoft.Databricks/workspaces/databricks-repro]

Apply complete! Resources: 1 added, 1 changed, 0 destroyed.
```
Were my repro steps accurate and a reasonable facsimile of the steps you took in your environment? The one thing that was not clear to me in your issue's configuration file was whether you created all of the `azurerm_key_vault_access_policy` resources needed to successfully enable the CMK scenario. If you look at my configuration files you will see that I have 3 key vault access policies defined (e.g., `notebook`, `terraform`, and `databricks`) that grant Terraform and Databricks permissions to access the Key Vault keys. I have pulled the resource definitions from the above step 2 configuration file to make it easier to point out the DBFS key vault access policy, please see below.
resource "azurerm_databricks_workspace_root_dbfs_customer_managed_key" "databricks_DBFS" {
depends_on = [azurerm_key_vault_access_policy.databricks]
workspace_id = azurerm_databricks_workspace.repro.id
key_vault_key_id = azurerm_key_vault_key.databricks_encrypt_dbfs_key.id
}
resource "azurerm_key_vault_access_policy" "databricks" {
depends_on = [azurerm_databricks_workspace.repro]
key_vault_id = azurerm_key_vault.repro.id
tenant_id = azurerm_databricks_workspace.repro.storage_account_identity.0.tenant_id
object_id = azurerm_databricks_workspace.repro.storage_account_identity.0.principal_id
key_permissions = [
"Get",
"GetRotationPolicy",
"UnwrapKey",
"WrapKey",
"Delete",
]
}
NOTE: The `azurerm_databricks_workspace_root_dbfs_customer_managed_key` resource is just the `azurerm_databricks_workspace_customer_managed_key` resource that has been renamed in the private branch I used to repro this issue.
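For completeness, the other two access policies mentioned above (`terraform` and `notebook`) follow the same pattern. Here is a sketch of the `terraform` one, granting the identity running Terraform access to the vault; the exact permission list in my repro configuration may differ:

```hcl
data "azurerm_client_config" "current" {}

resource "azurerm_key_vault_access_policy" "terraform" {
  key_vault_id = azurerm_key_vault.repro.id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = data.azurerm_client_config.current.object_id

  key_permissions = [
    "Create",
    "Delete",
    "Get",
    "GetRotationPolicy",
    "List",
    "Purge",
  ]
}
```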
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Is there an existing issue for this?
Community Note
Terraform Version
1.4.5
AzureRM Provider Version
3.50.0
Affected Resource(s)/Data Source(s)
azurerm_databricks_workspace
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
The workspace should have been modified properly with the additional CMKs.
The resource group mentioned in the error is the resource group where the workspace is deployed, not the managed resource group.
I have found a similar issue here: https://github.com/Azure/azure-cli/issues/22614, but cannot find what fix was applied.
When I checked with Microsoft, they confirmed that, since the issue occurs before the resources are deployed, it does not occur at the resource provisioning step and is therefore not related to either Azure or Databricks.
Can someone please let me know if there are any known fixes I can try from my end?
Actual Behaviour
The same code was working before adding the additional two CMKs for managed disk and DBFS. While trying to modify my Databricks workspace (re-deploying via Terraform) with the additional CMKs (for DBFS and disk), I am hitting the error.
To find which CMK is giving the trouble, I tried removing the extra configuration added for the managed disk and DBFS CMKs separately.
While trying to redeploy the same workspace after removing the additional code added for the managed disk CMK, I get the below error:
`performing CreateOrUpdate: unexpected status 400 with error: DiskEncryptionPropertiesRequired: Existing Disk Encryption Properties must be specified in the PUT request.`
Redeploying the same workspace after removing the additional code added for the DBFS CMK gives the same error from the start.
I referred to this ticket and redeployed the same workspace after setting `managed_disk_cmk_rotation_to_latest_version_enabled = false`; this threw another error saying: `polling after create or update: internal-error: a polling status of failed should be surfaced as a polling failed error`
Steps to Reproduce
`terraform apply`
Important Factoids
No response
References
https://github.com/hashicorp/terraform-provider-azurerm/blob/main/website/docs/r/databricks_workspace.html.markdown
https://github.com/hashicorp/terraform-provider-azurerm/issues/21487