dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0
298 stars 120 forks source link

Catalog / Schema quoting issue #297

Open samuhepp opened 5 months ago

samuhepp commented 5 months ago

Describe the bug

Hello!

Having been playing around with this, there looks to be an issue with the way this package constructs the naming - specifically with the Databricks adapter.

Given the following sample Yaml

version: 2

sources:
  - name: my-source
    catalog: my-catalog
    schema: my-schema
    tables:
      - name: table1
        external:
          ...
        columns:
          ...

Within databricks, you have to quote any name that doesn't contain alphanumeric and undescores using backticks "`". This doesn't happen here when I run dbt run-operation stage_external_sources. The error:

09:10:11  Encountered an error while running operation: Runtime Error
  Runtime Error

    [INVALID_IDENTIFIER] The identifier dataplatform-sandbox is invalid. Please, consider quoting it with back-quotes as `my-catalog`. SQLSTATE: 42602 (line 3, pos 57)

    == SQL ==
    /* {"app": "dbt", "dbt_version": "1.8.0b2", "dbt_databricks_version": "1.8.0b2", "databricks_sql_connector_version": "3.1.2", "profile_name": "dev", "target_name": "dev", "connection_name": "macro_stage_external_sources"} */

                     create schema if not exists my-catalog.my-schema
    -----------------------------------------------^^^

(It's complaining about the Hyphen)

I've attempted to quote this myself via the following config:

version: 2

sources:
  - name: my-source
    catalog: "`my-catalog`"
    schema: "`my-schema`"
    tables:
      - name: table1
        external:
          ...
        columns:
          ...

which seems to get past this issue, but then fails later down the line because all identifiers are being quoted twice.

    [PARSE_SYNTAX_ERROR] Syntax error at or near 'my'. SQLSTATE: 42601 (line 5, pos 31)

    == SQL ==
    /* {"app": "dbt", "dbt_version": "1.8.0b2", "dbt_databricks_version": "1.8.0b2", "databricks_sql_connector_version": "3.1.2", "profile_name": "dev", "target_name": "dev", "connection_name": "macro_stage_external_sources"} */

            drop table if exists ``my-catalog``.``my-schema``.`raw_adis`
    -------------------------------^^^

(note the double quotes)

Steps to reproduce

Expected results

Creation of schema if not exists should be quoted

Actual results

Fails when running dbt run-operation stage_external_sources

Screenshots and log output

System information

The contents of your packages.yml file:

Which database are you using dbt with?

The output of dbt --version:

Core:
  - installed: 1.8.0-b2
  - latest:    1.7.13   - Ahead of latest version!

Plugins:
  - databricks: 1.8.0b2 - Ahead of latest version!
  - spark:      1.8.0b2 - Ahead of latest version!

The operating system you're using:

The output of python --version:

Python 3.10.12

Additional context