databricks / terraform-provider-databricks

Databricks Terraform Provider
https://registry.terraform.io/providers/databricks/databricks/latest

[ISSUE] Issue with `databricks_cluster` resource when create cluster from Shared cluster policy family #4108

Open qnix-databricks opened 1 month ago

qnix-databricks commented 1 month ago

Configuration

terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

provider "databricks" {
  config_file = "/Users/quan.ta/project/tf/terraform-databricks-examples/.databrickscfg"
  profile = "DEFAULT"
  # DEFAULT profile refers this workspace: https://e2-demo-field-eng.cloud.databricks.com
}

module "shared_cluster_policies_dev_developer" {
  source = "../../modules/group_cluster_policies"

  # Builtin cluster policy family to start from (required).
  family = "Shared"

  # Group name (required).
  group_name = "qta_dev"

  # Policy overrides, can be removed to use the defaults.
  overrides = {
    autotermination_minutes = {
      type   = "fixed"
      value  = 30
      hidden = true
    },
    data_security_mode = {
      type   = "fixed"
      value  = "USER_ISOLATION"
      hidden = true
    }
  }

  # Remove this to use the default (small, medium, large).
  sizes_enabled = {
    "small"  = 16
    "medium" = 32
    "large" = 64
  }

  # Tags are converted to overrides and merged with them (required).
  tags = {
    business_unit = "DNA"
    department    = "IT-EA"
    team          = "BI-DW"
    application   = "UDL"
  }
}

data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}

resource "databricks_cluster" "dev_shared_clusters" {
  for_each = module.shared_cluster_policies_dev_developer.policies

  cluster_name = "Dev Shared ${each.value.cluster_policy.name}"
  spark_version = data.databricks_spark_version.latest_lts.id
  policy_id = each.value.cluster_policy.id
  apply_policy_default_values = true
}

Please find the module definitions in the attached archive: modules.tgz

Expected Behavior

Should be able to create a cluster from the Shared cluster policy family.

Actual Behavior

│ Error: cannot create cluster: NumWorkers could be 0 only for SingleNode clusters. See https://docs.databricks.com/clusters/single-node.html for more details
│
│   with databricks_cluster.dev_shared_clusters["medium"],
│   on cluster_policies.tf line 49, in resource "databricks_cluster" "dev_shared_clusters":
│   49: resource "databricks_cluster" "dev_shared_clusters" {
│
╵

Note that the configuration does not specify num_workers. I tried setting apply_policy_default_values to both true and false, and also omitting it entirely.
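As a possible workaround (untested against this policy family), explicitly setting a nonzero num_workers in the resource should stop the provider from sending 0 to the Clusters API; the value 2 below is illustrative:

```hcl
resource "databricks_cluster" "dev_shared_clusters" {
  for_each = module.shared_cluster_policies_dev_developer.policies

  cluster_name                = "Dev Shared ${each.value.cluster_policy.name}"
  spark_version               = data.databricks_spark_version.latest_lts.id
  policy_id                   = each.value.cluster_policy.id
  apply_policy_default_values = true

  # Hypothetical workaround: pin a nonzero worker count so the provider
  # does not default num_workers to 0 for a non-single-node cluster.
  num_workers = 2
}
```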

Here is the output of terraform plan:

  # databricks_cluster.dev_shared_clusters["small"] will be created
  + resource "databricks_cluster" "dev_shared_clusters" {
      + autotermination_minutes      = 60
      + cluster_id                   = (known after apply)
      + cluster_name                 = "Dev Shared shared - qta_dev - small"
      + default_tags                 = (known after apply)
      + driver_instance_pool_id      = (known after apply)
      + driver_node_type_id          = (known after apply)
      + enable_elastic_disk          = (known after apply)
      + enable_local_disk_encryption = (known after apply)
      + id                           = (known after apply)
      + node_type_id                 = (known after apply)
      + num_workers                  = 0
      + policy_id                    = "00002C211A3A8831"
      + spark_version                = "15.4.x-scala2.12"
      + state                        = (known after apply)
      + url                          = (known after apply)
    }

I noted that when creating the cluster manually in the UI using the same Shared policy, num_workers = 0 is not automatically added, and the cluster is created without issue.
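To compare what the generated policy actually pins against what the provider sends, the policy definition can be inspected with the `databricks_cluster_policy` data source (the policy name below is copied from the plan output above and is only illustrative):

```hcl
# Dump the policy JSON so the fixed/default values for num_workers and
# autoscale can be compared with the provider's request payload.
data "databricks_cluster_policy" "shared_small" {
  name = "shared - qta_dev - small"
}

output "shared_small_definition" {
  value = data.databricks_cluster_policy.shared_small.definition
}
```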

Steps to Reproduce

  1. terraform apply

(the module code is attached.)

Terraform and provider versions

% tf version
Terraform v1.9.7
on darwin_arm64

Is it a regression?

Debug Output

Important Factoids

Would you like to implement a fix?

qnix-databricks commented 1 month ago

Adding autoscale doesn't appear to change the num_workers = 0 behavior:

resource "databricks_cluster" "dev_shared_clusters" {
  for_each = module.shared_cluster_policies_dev_developer.policies

  cluster_name = "Dev Shared ${each.value.cluster_policy.name}"
  spark_version = data.databricks_spark_version.latest_lts.id
  policy_id = each.value.cluster_policy.id
  apply_policy_default_values = false

  autoscale {
    min_workers = 1
    max_workers = 50
  }
}

Output of terraform plan:
  # databricks_cluster.dev_shared_clusters["small"] will be created
  + resource "databricks_cluster" "dev_shared_clusters" {
      + apply_policy_default_values  = false
      + autotermination_minutes      = 60
      + cluster_id                   = (known after apply)
      + cluster_name                 = "Dev Shared shared - qta_dev - small"
      + default_tags                 = (known after apply)
      + driver_instance_pool_id      = (known after apply)
      + driver_node_type_id          = (known after apply)
      + enable_elastic_disk          = (known after apply)
      + enable_local_disk_encryption = (known after apply)
      + id                           = (known after apply)
      + node_type_id                 = (known after apply)
      + num_workers                  = 0
      + policy_id                    = "00002C211A3A8831"
      + spark_version                = "15.4.x-scala2.12"
      + state                        = (known after apply)
      + url                          = (known after apply)

      + autoscale {
          + max_workers = 50
          + min_workers = 1
        }
    }