databricks/cli

Added support for creating all-purpose clusters #1698

Status: Closed (andrewnester closed this 1 week ago)

andrewnester commented 1 month ago

Changes

Added support for creating all-purpose clusters

Example of configuration

bundle:
  name: clusters

resources:
  clusters:
    test_cluster:
      cluster_name: "Test Cluster"
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"

  jobs:
    test_job:
      name: "Test Job"
      tasks:
        - task_key: test_task
          existing_cluster_id: ${resources.clusters.test_cluster.id}
          notebook_task:
            notebook_path: "./src/test.py"

targets:
  development:
    mode: development
    compute_id: ${resources.clusters.test_cluster.id}
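The `${resources.clusters.test_cluster.id}` strings above are references that get resolved at deploy time, once the cluster exists. As a rough illustration only (not the CLI's actual implementation; the resolved ID value here is made up), such references can be substituted like this:

```python
import re

# Hypothetical resolved values; in the real CLI these come from the
# deployment itself (e.g. Terraform state), not a hard-coded dict.
resolved = {
    "resources.clusters.test_cluster.id": "1234-567890-abcde123",
}

def interpolate(value: str) -> str:
    """Replace ${path.to.ref} placeholders with resolved values,
    leaving unknown references untouched."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: resolved.get(m.group(1), m.group(0)),
        value,
    )

print(interpolate("existing_cluster_id: ${resources.clusters.test_cluster.id}"))
# -> existing_cluster_id: 1234-567890-abcde123
```

This is also why the same reference can appear in both `existing_cluster_id` and `compute_id`: both are plain strings until resolution.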

Tests

Added unit, config, and E2E tests.

andrewnester commented 1 month ago

@pietern It is started by default, so deployment does indeed take longer in this case.

andrewnester commented 1 month ago

"How can folks use this with: bundle: compute_id: ${resources.clusters.my_cluster.id}"

You can do this in target overrides, but I'd rather remove it from the example because it is confusing; it is just there to illustrate that a reference string can be used for the compute ID.

pietern commented 1 month ago

@andrewnester Is there a way we can avoid starting it (or waiting for it to start)?

andrewnester commented 1 month ago

@pietern Unfortunately we can't, as the TF provider and the Go SDK explicitly wait for the cluster to be running: https://github.com/databricks/terraform-provider-databricks/blob/main/clusters/resource_cluster.go#L413C2-L420
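The wait-for-running behavior referenced above is essentially a polling loop. A simplified sketch (function and state names are illustrative, not the real Terraform provider or Go SDK API) of why a deploy blocks until the cluster is up:

```python
import time

def wait_until_running(get_state, timeout_s=600.0, poll_s=1.0):
    """Poll cluster state until it is RUNNING; fail on a terminal
    state or on timeout. This is what makes deployment block."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "RUNNING":
            return
        if state in ("ERROR", "TERMINATED"):
            raise RuntimeError(f"cluster entered {state} while waiting")
        if time.monotonic() >= deadline:
            raise TimeoutError("cluster did not reach RUNNING in time")
        time.sleep(poll_s)

# Simulate a cluster that becomes RUNNING on the third poll.
states = iter(["PENDING", "PENDING", "RUNNING"])
wait_until_running(lambda: next(states), poll_s=0)
print("cluster is running")
# -> cluster is running
```

Since the loop only returns on RUNNING, every deploy that creates (or restarts) the cluster pays the full startup time.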

lennartkats-db commented 1 month ago

@andrewnester Waiting for it to start indeed seems pretty unfortunate, especially for development ... Does that happen every time, or just when it's first created?

lennartkats-db commented 1 month ago

"You can do this in target overrides but I'd better remove it from example because it is confusing, this is just to illustrate that reference string can be used for compute id"

But that would work, right? Setting bundle.cluster_id? Otherwise these all-purpose clusters are quite hard to use: you can't use a regular override to override a job cluster (new_cluster) with an all-purpose cluster (existing_cluster_id).

(edit: I opened a thread on this below)
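To make the override difficulty concrete, here are the two mutually exclusive ways a task can specify compute, using the field names from the example in the PR description (a sketch only; whether a target override can swap one for the other is exactly the open question in this thread):

```yaml
# Task on a job (ephemeral) cluster: compute defined inline via new_cluster.
tasks:
  - task_key: test_task
    new_cluster:
      spark_version: "13.3.x-scala2.12"
      node_type_id: "i3.xlarge"
      num_workers: 2

# Task on an all-purpose cluster: compute referenced via existing_cluster_id.
tasks:
  - task_key: test_task
    existing_cluster_id: ${resources.clusters.test_cluster.id}
```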

pietern commented 1 month ago

Adding the label because we need the TF-side change to be released first.