[ISSUE] Issue with `databricks_metastore_assignment` resource

ethompsy commented 1 year ago

Hello! 👋

I am having some (apparently intermittent) issues attaching metastores to workspaces.

Configuration

Just following the guide here:

https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/unity-catalog#create-a-unity-catalog-metastore-and-link-it-to-workspaces

resource "databricks_mws_workspaces" "this" {
  provider        = databricks.mws
  account_id      = var.databricks_account_id
  aws_region      = local.region
  workspace_name  = local.dbr_prefix
  deployment_name = var.workspace_url

  credentials_id           = databricks_mws_credentials.this.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.this.storage_configuration_id
  network_id               = databricks_mws_networks.this.network_id

  token {
    comment = "Terraform PAT"
  }
}

resource "time_sleep" "wait_for_workspace" {
  depends_on      = [databricks_mws_workspaces.this]
  create_duration = "2m"
}

resource "databricks_metastore_assignment" "this" {
  provider             = databricks.mws
  metastore_id         = var.databricks_metastore_id
  workspace_id         = databricks_mws_workspaces.this.workspace_id
  default_catalog_name = ""
  depends_on           = [time_sleep.wait_for_workspace]
}

Expected Behavior

I would expect that the Databricks Terraform Provider would attach a metastore to a workspace that was just created.

Actual Behavior

Instead this error is thrown:

│ Error: cannot update metastore assignment: Workspace is not in the same region as metastore. Workspace region: Unknown, Metastore region: us-east-1
│ 
│   with module.bi-workspace.databricks_metastore_assignment.this,
│   on .terraform/modules/bi-workspace/main.tf line 56, in resource "databricks_metastore_assignment" "this":
│   56: resource "databricks_metastore_assignment" "this" {
│

If I look in the Account UI I can see that the Workspace is known to be in us-east-1.

Steps to Reproduce

terraform apply

Terraform and provider versions

$ terraform version
Terraform v1.6.2
on darwin_arm64
+ provider registry.terraform.io/databricks/databricks v1.29.0
+ provider registry.terraform.io/hashicorp/aws v5.21.0
+ provider registry.terraform.io/hashicorp/time v0.9.1

Is it a regression?

I have tried v1.29.0, v1.28.1, v1.28.0, v1.27.0, and v1.26.0 of the databricks provider. I am seeing the same issue in all of these so far.

Debug Output

2023-11-06T14:20:49.610-0500 [DEBUG] provider.terraform-provider-databricks_v1.29.0: PUT /api/2.0/accounts/<account_id>/workspaces/<workspace_id>/metastores/<metastore_id>
> {
>   "metastore_assignment": {
>     "metastore_id": "<metastore_id>"
>   }
> }
< HTTP/2.0 400 Bad Request
< {
<   "details": [
<     {
<       "@type": "type.googleapis.com/google.rpc.ErrorInfo",
<       "domain": "unity-catalog.databricks.com",
<       "metadata": {
<         "msg": "Workspace is not in the same region as metastore. Workspace region: Unknown, Metastore region: u... (8 more bytes)"
<       },
<       "reason": "INVALID_PARAMETER_VALUE"
<     },
<     {
<       "@type": "type.googleapis.com/google.rpc.RequestInfo",
<       "request_id": "5342a9cc-d485-4111-bcbf-2bbc218b4946",
<       "serving_data": ""
<     }
<   ],
<   "error_code": "INVALID_PARAMETER_VALUE",
<   "message": "Workspace is not in the same region as metastore. Workspace region: Unknown, Metastore region: u... (8 more bytes)"
< }: @module=databricks tf_resource_type=databricks_metastore_assignment @caller=/home/runner/work/terraform-provider-databricks/terraform-provider-databricks/logger/logger.go:33 tf_req_id=4266bcfd-c79a-1a46-e484-f85079e29c74 tf_rpc=ApplyResourceChange tf_provider_addr=registry.terraform.io/databricks/databricks timestamp=2023-11-06T14:20:49.609-0500

Important Factoids

We have been a Databricks customer for a long time and our accounts predate Unity Catalog and Account level SSO and many other features.

ethompsy commented 1 year ago

Looking at the debug output and comparing it to the Databricks API documentation, I think the issue is the use of PUT here. The API docs specify POST for creating a workspace's metastore assignment:

https://docs.databricks.com/api/account/accountmetastoreassignments/create

I can reproduce this using curl. If I try to create this same workspace assignment to the metastore using PUT, I get an HTTP 400 error like this:

< HTTP/2 400 
< server: databricks
< date: Mon, 06 Nov 2023 20:06:34 GMT
< content-type: application/json; charset=utf-8
< content-length: 62
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-content-type-options: nosniff
< vary: Accept-Encoding
< 
{ [62 bytes data]
100    62  100    62    0     0    116      0 --:--:-- --:--:-- --:--:--   117
* Connection #0 to host accounts.cloud.databricks.com left intact
{
  "error_code": "BAD_REQUEST",
  "message": "Invalid UUID string: "
}

If I make the same call but change to a POST request it is successful and returns an HTTP 200 response.
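
For reference, a minimal sketch of the POST variant, reusing the path and request body from the debug output above and the accounts.cloud.databricks.com host from the curl trace. The bearer-token authentication, the DATABRICKS_TOKEN variable, and the function names are assumptions for illustration, not taken from the provider or the docs:

```shell
# Host as seen in the curl trace above.
ACCOUNTS_HOST="https://accounts.cloud.databricks.com"

# Build the assignment URL. $1 = account id, $2 = workspace id, $3 = metastore id
assignment_url() {
  printf '%s/api/2.0/accounts/%s/workspaces/%s/metastores/%s\n' \
    "$ACCOUNTS_HOST" "$1" "$2" "$3"
}

# Build the JSON body shown in the debug output. $1 = metastore id
assignment_body() {
  printf '{"metastore_assignment": {"metastore_id": "%s"}}\n' "$1"
}

# Send the create request with POST instead of PUT.
# Assumption: a bearer token in DATABRICKS_TOKEN is accepted for account-level calls.
create_assignment() {
  curl --silent --show-error --request POST \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    --header "Content-Type: application/json" \
    --data "$(assignment_body "$3")" \
    "$(assignment_url "$1" "$2" "$3")"
}
```

Calling create_assignment with the account, workspace, and metastore IDs sends the request; switching --request to PUT should reproduce the failing variant.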

ethompsy commented 1 year ago

It seems that the provider is trying to migrate the metastore assignment from a now-deleted workspace to the current workspace. I cannot imagine that this is compatible with the API.

Terraform will perform the following actions:

  # module.data-hub-workspace.databricks_metastore_assignment.this will be updated in-place
  ~ resource "databricks_metastore_assignment" "this" {
        id           = "8410422733597792|80975123-190b-4c3b-9c8b-b659c9363e97"
      ~ workspace_id = 8410422733597792 -> 4568212316559807
        # (1 unchanged attribute hidden)
    }

In this case, the workspace with ID 8410422733597792 no longer exists. The effect of this proposed change would be a call to the API using PUT, as shown in the debug log. However, the assignment that is being updated does not yet exist for the new workspace.

In light of this, I removed the module.data-hub-workspace.databricks_metastore_assignment.this object from the state and ran apply again. This time the assignment was created. The issue seems to be triggered by code changes that cause the workspace to be rebuilt, which is what I was going through during development.
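
To spot this situation before applying, the plan output can be scanned for metastore assignments whose workspace_id is about to be changed in place. This is a rough sketch that assumes the human-readable plan format shown above, which is not a stable interface; the function name is hypothetical:

```shell
# Read `terraform plan` text on stdin and print the address of every
# databricks_metastore_assignment whose workspace_id is changed in place.
find_moved_assignments() {
  awk '
    # Header line: "# <address> will be updated in-place"
    /^[ \t]*# .*databricks_metastore_assignment\..* will be updated in-place/ {
      addr = $2
      next
    }
    # A changed workspace_id inside that resource block
    addr != "" && /^[ \t]*~ workspace_id[ \t]*=/ {
      print addr
      addr = ""
    }
    # End of the resource block
    /^[ \t]*}/ { addr = "" }
  '
}
```

It can be fed with `terraform plan -no-color | find_moved_assignments`; any address it prints is a candidate for removal from the state before the next apply.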

ethompsy commented 1 year ago

There is a workaround for this. Since the databricks_metastore_assignment is not really an independent object but a link between two independent objects (a workspace and a metastore), whenever this error comes up:

│ Error: cannot update metastore assignment: Workspace is not in the same region as metastore. Workspace region: Unknown, Metastore region: us-east-1
│ 
│   with module.bi-workspace.databricks_metastore_assignment.this,
│   on .terraform/modules/bi-workspace/main.tf line 56, in resource "databricks_metastore_assignment" "this":
│   56: resource "databricks_metastore_assignment" "this" {
│

It is safe to just drop the assignment from the state and run apply again.

$ terraform state rm module.bi-workspace.databricks_metastore_assignment.this

Not ideal, but it works for now.
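
The same workaround as a small guarded helper, in case it has to be repeated across modules. This is only a sketch, assuming it is run from the root module directory; the function name is hypothetical. It calls terraform state rm only when the address is actually present in the state:

```shell
# Remove a stale metastore assignment from the Terraform state.
# $1 = full resource address
drop_stale_assignment() {
  addr="$1"
  # `terraform state list` prints one resource address per line
  if terraform state list | grep -Fxq -- "$addr"; then
    terraform state rm "$addr"
  else
    echo "not in state: $addr" >&2
    return 1
  fi
}
```

For example, `drop_stale_assignment module.bi-workspace.databricks_metastore_assignment.this` followed by `terraform apply` lets the provider create the assignment from scratch.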

mgyucht commented 1 year ago

@ethompsy we are merging a fix for this today and will try to include it in the next TF release today or Monday at the latest.
