databricks / terraform-provider-databricks

Databricks Terraform Provider
https://registry.terraform.io/providers/databricks/databricks/latest
Other
427 stars 367 forks source link

[FEATURE] Ability to manage Starter Warehouse attributes from TF #3050

Open ACSmith07 opened 7 months ago

ACSmith07 commented 7 months ago

Use-cases

End goal is to manage the Starter Warehouse. At minimum permissions need to configurable. Ideally we could reconfigure to desired settings (2x small, serverless). A direct implementation would be best. Similar issue arises resetting default permissions on the default Compute Policies, but data providers correctly return ID for them.

Reconfigure Warehouse: Calling sql_endpoint with name = "Starter Warehouse" throws an error that the name already exists. So no configuration is possible.

Attempting to set permissions, permissions {sql_endpoint_id =} works. That's fine if hardcoding a known ID. However ID varies per workspace, so it needs to be provided by a data block (since the ID can't be a Terraform resource, having been auto-created.

Attempted Solutions

There seems to be a bug on data "sql_warehouse". I attempted to use data "sql_warehouse" { name_contains = "Starter Warehouse" } but it is returning an empty list. Note that the name_contains string was copied directly from calling data "sql_warehouses", feeding the ID to data "sql_warehouse" with a for_each, then copying the output value for name = "". So the string certainly exists as a name. But both output and errors say it is returning an empty list. I also attempted using partial strings "Starter" and "starter".

If sql_warehouses (optionally) returned a rich map like sql_warehouse rather than a list of IDs, list comprehension can filter the set of all warehouses on name directly. Alternately, I previously used sql_warehouse { name_contains } successfully - the approach works if the data object searches accurately.

The below approach allowed me to run permissions, by executing APPLY stepwise. However it must be run stepwise, or the 2nd step fails with error data.databricks_sql_warehouses.all_ids.ids is a set of string, known only after apply . This makes it un-viable. in practice.

  1. Generate a complete Warehouse ID list with sql_warehouses
  2. Generate rich warehouse data with sql_warehouse { for_each = sql_warehouses.all.ids }
  3. Extract the needed name/or ID using list comprehension on the output map: [ for k,v in data.databricks_sql_warehouse.starter : k if v.name == "Starter Warehouse" ]

Note that the code in the sql_warehouse documentation will not run, returning the error above regarding for_each on a set of unknown length. `data "databricks_sql_warehouses" "all" { }

data "databricks_sql_warehouse" "all" { for_each = data.databricks_sql.warehouses.ids id = each.value }`

Proposal

References

Output when using name_contains = "Starter Warehouse" (I'm looking at the warehouse in UX). `Outputs:

sqlcheck = { "id" = "" "ids" = toset([]) "warehouse_name_contains" = "Starter Warehouse" } `

nkvuong commented 7 months ago

@ACSmith07 does this block work for you?

data "databricks_sql_warehouse" "starter" {
  name = "Starter Warehouse"
}

output "starter_id" {
  value = data.databricks_sql_warehouse.starter.id
}

you are right that there is a bug with databricks_sql_warehouses data source around case sensitivity - basically warehouse_name_contains = "starter warehouse" would work but any uppercase would fail the check

ACSmith07 commented 7 months ago

--see update below first-- Thanks! That does work. I think I was being a bit daft and didn't see that singular sql_warehouse can take a name argument.

I confirmed the permissions resource acts on Starter Warehouse correctly. Thanks a ton! It allows me to faux-delete by limiting permission to admin-only. Which solves the immediate use case.

That leaves the problem that Starter Warehouse itself can't be modified. But that's a larger/separate ticket I imagine.

There also remains a slight bug on sql_warehouses (plural). Oddly, copying the exact name from your successful code into the plural-version provider still outputs an empty list. So this narrows down the specific bug at least. Code/Output below. I confirmed the behavior both in my larger project and in a test script pointing to a different workspace, with only the 2 data providers and the 2 outputs.

Code

`# Suggested data "databricks_sql_warehouse" "starter" { name = "Starter Warehouse" }

output "suggested_method" { value = data.databricks_sql_warehouse.starter.id }

Mine

data "databricks_sql_warehouses" "starter_warehouse" { warehouse_name_contains = "Starter Warehouse" }

output "my_method" { value = data.databricks_sql_warehouses.starter_warehouse }`

Output

`Apply complete! Resources: 5 added, 2 changed, 5 destroyed.

Outputs:

sqlmine = { "id" = "" "ids" = toset([]) "warehouse_name_contains" = "Starter Warehouse" } suggested_method = "7ab6f10d9ea3b3f7"`

ACSmith07 commented 7 months ago

Scratch that - found the import block availability on sql_endpoint. I've now successfully modified Starter Warehouse. Really sorry for all the noise here, probably shouldn't open issues at midnight.

The only remaining concern now is the inconsistent behavior string search between on sql_warehouses vs. sql_warehouse. With sql_warehouses (plural) seeming bugged.

Thanks a ton for the help.