databricks / terraform-provider-databricks

Databricks Terraform Provider
https://registry.terraform.io/providers/databricks/databricks/latest

[ISSUE] `databricks_cluster` resource import does not include attached jar libraries #3661

Open nchammas opened 1 month ago

nchammas commented 1 month ago

Configuration

import {
  to = databricks_cluster.my_cluster
  id = "some-cluster-id"
}

Expected Behavior

There is a jar library attached to this cluster under the "Libraries" tab in the web UI. When I run terraform plan -generate-config-out=cluster.tf, that library should be imported as follows:

resource "databricks_cluster" "my_cluster" {
  ...
  library {
    jar = "dbfs:/some-path.jar"
  }
}

Actual Behavior

The library is completely ignored during import.

Steps to Reproduce

  1. Create a cluster manually and attach a jar library to it.
  2. Import that cluster into a databricks_cluster resource and let Terraform generate the config for it.
  3. Confirm that there is no library block within the generated config for the cluster.

Terraform and provider versions

$ terraform version
Terraform v1.8.5
on darwin_amd64
+ provider registry.terraform.io/databricks/databricks v1.47.0

Is it a regression?

I don't believe so.

Debug Output

There is no mention of libraries in the debug output. I can clean up and share this output if you still think it would be helpful.

Important Factoids

None.

Would you like to implement a fix?

No.

alexott commented 1 month ago

Most probably it's related to #3558... But also, instead of terraform plan -generate-config-out, it's better to use the Exporter, which will handle dependencies as well.

nchammas commented 1 month ago

Thanks for the reference. Agreed, it's likely related and may even have the same underlying cause. (I somehow missed that issue when I did a search.)

With regard to the exporter, is it not a goal of this project to have the "native" Terraform import work correctly? Or are you just sharing practical advice that this standalone exporter tends to work better?

alexott commented 1 month ago

The culprit for your problem is this line: https://github.com/databricks/terraform-provider-databricks/blob/main/clusters/resource_cluster.go#L481 - it was added to allow people to add libraries without Terraform removing them. But when you read a resource without corresponding TF code, the number of libraries is always 0.
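As a concrete illustration (a sketch only, reusing the placeholder resource name and jar path from this issue): because the read only keeps as many libraries as the configuration already declares, a freshly imported cluster with no library block in code reads back zero libraries, so nothing ends up in the generated config. A practical workaround would be to add the block by hand after generation:

resource "databricks_cluster" "my_cluster" {
  # ... attributes produced by -generate-config-out ...

  # Added manually, since the import does not emit it.
  library {
    jar = "dbfs:/some-path.jar"
  }
}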

The main problem with -generate-config-out is that it is "dumb" - it doesn't handle references to other objects. I.e., if you're using instance pools in your cluster, it will just put the instance pool ID string into the code, making your code work only for your specific workspace. The Exporter, by contrast, will generate a resource block for the instance pool and replace the ID with a reference to that instance pool.
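To make the difference concrete, here is a hedged sketch (the resource names and pool ID are illustrative, not from a real workspace). -generate-config-out would emit a hard-coded ID:

resource "databricks_cluster" "my_cluster" {
  # Literal ID copied from the source workspace; only valid there.
  instance_pool_id = "0123-456789-pool00"
}

whereas the Exporter would also generate the pool and reference it:

resource "databricks_instance_pool" "this" {
  instance_pool_name = "shared-pool"
  # ... other attributes exported from the workspace ...
}

resource "databricks_cluster" "my_cluster" {
  # Reference instead of a literal ID, so the code is portable across workspaces.
  instance_pool_id = databricks_instance_pool.this.id
}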

nchammas commented 1 month ago

The main problem with -generate-config-out is it is "dumb" - it doesn't handle references to other objects

What does "object" mean in the context of Terraform? From what I understand, library is a configuration block within the databricks_cluster resource. It isn't its own resource.

If it were its own resource, I would understand Terraform not generating the config for it, since my import block only targeted the databricks_cluster resource. But since library is just an attribute of the databricks_cluster resource, I expect an import of that resource to include all of its attributes.

I see there is a databricks_library resource, which creates a bit of ambiguity for the user trying to do an import. I suppose the behavior I'm seeing is an indication that a library is really its own resource, even though it can be expressed as an attribute of the cluster resource.

Perhaps, then, all that's needed is a note in the cluster resource's import documentation explaining that library (and perhaps some other attributes) is only imported as a standalone resource, which therefore needs its own import block, rather than as an attribute of the cluster resource.
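If that's the direction, a hedged sketch of what such a note might show (the import ID format below is purely illustrative and would need to be confirmed against the databricks_library documentation):

import {
  to = databricks_library.my_jar
  id = "some-cluster-id/dbfs:/some-path.jar"  # illustrative only
}

resource "databricks_library" "my_jar" {
  cluster_id = databricks_cluster.my_cluster.id
  jar        = "dbfs:/some-path.jar"
}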