databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0

'external_shallow_clone' is not a valid DatabricksRelationType #798

Closed. data-blade closed this issue 1 month ago

data-blade commented 2 months ago

Describe the bug

this is a recurrence of a previously reported issue.

it only affects unity catalog; the hive metastore is not affected.

running dbt compile on a project that contains an external shallow clone triggers the error in the title.

under the hood, the adapter runs a query that contains this expression:

if(table_type in ('EXTERNAL', 'MANAGED', 'MANAGED_SHALLOW_CLONE'), 'table', lower(table_type)) as table_type

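EXTERNAL_SHALLOW_CLONE is not in that list, so it falls through to lower(table_type) and comes back as 'external_shallow_clone', which is not a valid DatabricksRelationType.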
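
To make the failure mode concrete, here is a minimal sketch in plain Python. `SketchRelationType` is a hypothetical stand-in for DatabricksRelationType (the real enum has its own member list), and `normalize` simply mirrors the SQL `if(...)` expression quoted above; neither is the adapter's actual code.

```python
from enum import Enum


class SketchRelationType(str, Enum):
    # Hypothetical stand-in for DatabricksRelationType; real members may differ.
    Table = "table"
    View = "view"


# The three table types that the quoted SQL expression maps to 'table'.
TABLE_LIKE = ("EXTERNAL", "MANAGED", "MANAGED_SHALLOW_CLONE")


def normalize(table_type: str) -> str:
    """Mirror the SQL: if(table_type in (...), 'table', lower(table_type))."""
    return "table" if table_type in TABLE_LIKE else table_type.lower()


def try_parse(table_type: str) -> str:
    """Return the enum value, or the ValueError message if validation fails."""
    try:
        return SketchRelationType(normalize(table_type)).value
    except ValueError as exc:
        return str(exc)


print(try_parse("MANAGED_SHALLOW_CLONE"))
print(try_parse("EXTERNAL_SHALLOW_CLONE"))
```

The first call resolves to the table member; the second reaches the enum constructor with a lowercased value that has no matching member, which is the same kind of ValueError shown in the traceback below.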

Steps To Reproduce

  1. use unity catalog and the library versions listed under System information (earlier versions are probably affected too)
  2. create a shallow clone with a location argument, so that it is created as external
  3. run dbt compile

Expected behavior

type EXTERNAL_SHALLOW_CLONE should pass type validation, just like MANAGED_SHALLOW_CLONE does
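
One way to picture the expected behavior is a normalization step that treats both shallow clone variants as tables. This is only a sketch under that assumption: `TABLE_LIKE_TYPES` and `normalize_table_type` are hypothetical names, and the fix that was actually merged may be implemented differently (for example, in the SQL expression itself).

```python
# Hypothetical set of table types that should all be reported as 'table'.
TABLE_LIKE_TYPES = frozenset(
    {"EXTERNAL", "MANAGED", "MANAGED_SHALLOW_CLONE", "EXTERNAL_SHALLOW_CLONE"}
)


def normalize_table_type(table_type: str) -> str:
    """Map table-like types to 'table'; lowercase everything else."""
    if table_type.upper() in TABLE_LIKE_TYPES:
        return "table"
    return table_type.lower()


print(normalize_table_type("EXTERNAL_SHALLOW_CLONE"))
print(normalize_table_type("VIEW"))
```

With this mapping, an external shallow clone would be handled exactly like a managed one, and other types would still pass through lowercased.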

Screenshots and log output

Traceback (most recent call last):
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/requires.py", line 138, in wrapper
    result, success = func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/requires.py", line 101, in wrapper
    return func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/requires.py", line 218, in wrapper
    return func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/requires.py", line 247, in wrapper
    return func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/requires.py", line 294, in wrapper
    return func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/requires.py", line 332, in wrapper
    return func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/cli/main.py", line 343, in compile
    results = task.run()
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/task/runnable.py", line 526, in run
    result = self.execute_with_hooks(selected_uids)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/task/runnable.py", line 486, in execute_with_hooks
    self.before_run(adapter, selected_uids)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/task/runnable.py", line 474, in before_run
    self.populate_adapter_cache(adapter)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/task/runnable.py", line 464, in populate_adapter_cache
    adapter.set_relations_cache(cachable_nodes)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/adapters/base/impl.py", line 553, in set_relations_cache
    self._relations_cache_for_schemas(relation_configs, required_schemas)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/adapters/base/impl.py", line 529, in _relations_cache_for_schemas
    for relation in future.result():
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt_common/utils/executor.py", line 16, in connected
    return func(*args, **kwargs)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/site-packages/dbt/adapters/databricks/impl.py", line 254, in list_relations_without_caching
    type=self.Relation.get_relation_type(kind),
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/enum.py", line 385, in __call__
    return cls.__new__(cls, value)
  File "/Users/michaelbohm/opt/miniconda3/envs/dbtx/lib/python3.10/enum.py", line 710, in __new__
    raise ve_exc
ValueError: 'external_shallow_clone' is not a valid DatabricksRelationType

System information

The output of dbt --version:

Core:
  - installed: 1.8.6
  - latest:    1.8.6 - Up to date!

Plugins:
  - databricks: 1.8.5 - Up to date!
  - spark:      1.8.0 - Up to date!

The operating system you're using: macOS 14.6

The output of python --version: Python 3.10.14

Additional context

note: we will eventually make all our shallow clones 'managed', but for the migration period of moving from hive to unity, we'll need external shallow clones.

benc-db commented 2 months ago

Fixed in a PR merged yesterday. Will release 1.8.6 shortly with fix.