databrickslabs / ucx

Automated migrations to Unity Catalog

[BUG]: Failing in creating External Location with name containing dot(.) #2342

Closed SomanathSankara closed 3 months ago

SomanathSankara commented 3 months ago

Current Behavior

UCX Version: 0.28.0

C:\Users\somanath.sankara>databricks labs ucx migrate-locations --subscription-id
[UPGRADE ADVISED] Newer ucx version was released 1 days ago. Please run databricks labs upgrade ucx to upgrade: v0.28.0 -> v0.31.0
18:10:29 INFO [databricks.sdk] Using Azure CLI authentication with AAD tokens
18:10:29 WARN [databricks.sdk] azure_workspace_resource_id field not provided. It is recommended to specify this field in the Databricks configuration to avoid authentication errors.
18:10:33 INFO [databricks.sdk] Using Azure CLI authentication with AAD tokens
18:10:33 WARN [databricks.sdk] azure_workspace_resource_id field not provided. It is recommended to specify this field in the Databricks configuration to avoid authentication errors.
18:10:37 WARN [d.l.u.azure.locations] Skip unsupported location: wasbs://databricks-workspace@.blob.core.windows.net/external.db/
18:10:37 WARN [d.l.u.azure.locations] Skip unsupported location: wasbs://db-data@.blob.core.windows.net/external_mnt.db/
18:10:49 INFO [d.l.u.assessment.crawlers] Checking in subscription
18:10:53 ERROR [src/databricks/labs/ucx.migrate-locations] InvalidParameterValue: CreateExternalLocation name "databricks-workspace_ucxsimulationstorage_external_abfss.db" is not a valid name

Expected Behavior

The dot (.) in the external location name should be replaced with an underscore (_) so that creating the location does not fail.

Steps To Reproduce

No response

Cloud

Azure

Operating System

Windows

Version

latest via Databricks CLI

Relevant log output

NA

JCZuurmond commented 3 months ago

Hi @SomanathSankara, the period (.) in the name is not the problem. The problem is that UCX does not support migrating WASB locations; see here.

Note that Microsoft deprecated its wasb driver for Azure storage:

Microsoft has deprecated the Windows Azure Storage Blob driver (WASB) for Azure Blob Storage in favor of the Azure Blob Filesystem driver (ABFS); see Connect to Azure Data Lake Storage Gen2 and Blob Storage. ABFS has numerous benefits over WASB; see Azure documentation on ABFS.

Unity Catalog requires an ADLS Gen2 storage account for external locations:

You can create an external location that references storage in an Azure Data Lake Storage Gen2 storage container or Cloudflare R2 bucket.

To migrate, there are two steps to verify or perform:

  1. The storage account should be ADLS Gen2. If it is not, create an ADLS Gen2 storage account and copy the data to it.
  2. Update the existing external location to use the ABFSS driver instead of WASB.
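
For step 2, the URL change itself can be sketched as below. This is only an illustration, not part of UCX: the helper name `wasb_to_abfs` and the account name `exampleaccount` are hypothetical. It assumes the standard Azure endpoint suffixes (`.blob.core.windows.net` for WASB, `.dfs.core.windows.net` for ABFS). Rewriting the URL does not make a storage account ADLS Gen2, so step 1 still has to be done first.

```python
from urllib.parse import urlsplit

# Scheme and endpoint mappings between the WASB and ABFS drivers.
SCHEME_MAP = {"wasb": "abfs", "wasbs": "abfss"}
BLOB_SUFFIX = ".blob.core.windows.net"
DFS_SUFFIX = ".dfs.core.windows.net"


def wasb_to_abfs(url: str) -> str:
    """Rewrite a wasb(s):// location as the equivalent abfs(s):// location.

    Hypothetical helper for illustration; it only rewrites the string and
    does not check that the storage account is actually ADLS Gen2.
    """
    parts = urlsplit(url)
    if parts.scheme not in SCHEME_MAP:
        raise ValueError(f"Not a WASB location: {url}")

    # The authority has the form <container>@<account>.blob.core.windows.net
    container, sep, host = parts.netloc.partition("@")
    if not sep or not host.endswith(BLOB_SUFFIX):
        raise ValueError(f"Unexpected WASB authority: {parts.netloc}")

    account = host[: -len(BLOB_SUFFIX)]
    if not container or not account:
        raise ValueError(f"Missing container or storage account: {url}")

    return f"{SCHEME_MAP[parts.scheme]}://{container}@{account}{DFS_SUFFIX}{parts.path}"


if __name__ == "__main__":
    # "exampleaccount" is a placeholder storage account name.
    print(wasb_to_abfs("wasbs://db-data@exampleaccount.blob.core.windows.net/external_mnt.db/"))
```

The helper deliberately raises on a location with no storage account name, such as the redacted URLs in the log above, rather than guessing one.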

Alternatively, you could skip migrating the WASB locations and (try to) migrate the tables by remapping each table with a WASB location to a destination that does not use a WASB location. If that does not work, please open an issue for it.