databricks / databricks-cli

(Legacy) Command Line Interface for Databricks

Setting a Unity Catalog as the default catalog doesn't work for clusters #666

Open · Teun-Roefs opened this issue 1 year ago

Teun-Roefs commented 1 year ago

Problem description

We've set up a couple of Unity Catalog catalogs and want each one to be the default for a specific workspace. Running the new Databricks CLI command databricks metastores assign METASTORE_ID DEFAULT_CATALOG_NAME WORKSPACE_ID succeeds, and when we retrieve the assignment list afterwards, we can see that the default catalog has been assigned to the correct workspace.
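
For reference, the sequence we ran looks roughly like this (the IDs are placeholders, and the verification subcommand is my best recollection of the new CLI's syntax, so it may differ by version):

    # Assign the metastore to the workspace with a default catalog
    databricks metastores assign METASTORE_ID DEFAULT_CATALOG_NAME WORKSPACE_ID

    # Inspect the current assignment; default_catalog_name shows the assigned catalog
    databricks metastores current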

When we run %sql SELECT current_catalog() on a fresh 13.2 cluster in the workspace, we still see hive_metastore as the default. Running the same query on a SQL Warehouse returns the correct default catalog.
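
Concretely, the same query returns different results depending on where it runs:

    -- In a notebook attached to a fresh 13.2 cluster:
    SELECT current_catalog();   -- returns hive_metastore (unexpected)

    -- On a SQL Warehouse in the same workspace:
    SELECT current_catalog();   -- returns the assigned default catalog (expected)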

The documentation (https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/hive-metastore) states:

All SQL warehouses and clusters will use this catalog as the default.

However, it seems to work only for SQL warehouses.

When we set the spark.databricks.sql.initial.catalog.name configuration on the compute cluster itself, the default catalog works as expected.
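
For anyone hitting the same issue, this is the workaround we use: set the initial catalog explicitly in the cluster's Spark config (my_default_catalog below is an example name, not from our setup):

    # Cluster > Advanced options > Spark config
    spark.databricks.sql.initial.catalog.name my_default_catalog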

Is this a bug?

How to reproduce

  1. Assign a Unity Catalog as default by running databricks metastores assign METASTORE_ID DEFAULT_CATALOG_NAME WORKSPACE_ID
  2. Run %sql SELECT current_catalog() inside a notebook on a fresh 13.2 cluster in the workspace you've assigned the catalog to
  3. The result is hive_metastore, while it should be your assigned catalog (the API check sketched below confirms the assignment itself is in place)
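
To rule out the assignment itself, you can also confirm it server-side over the REST API (a sketch using the Unity Catalog current-metastore-assignment endpoint; the host and token are placeholders):

    curl -s -H "Authorization: Bearer TOKEN" \
      https://WORKSPACE_HOST/api/2.1/unity-catalog/current-metastore-assignment
    # The response should include "default_catalog_name" set to your catalog
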
mgyucht commented 10 months ago

Hi @Teun-Roefs. I believe this is an issue with the new CLI, hosted at https://github.com/databricks/cli. Please file your issue there for triage.