JCZuurmond opened 5 months ago
Hey @JCZuurmond, good to hear from you!

Here's my understanding of the situation:

- dbt calls the third-level namespace `database`, the second-level namespace `schema`, and the first-level name `identifier` (also configurable as `alias`)
- Spark has historically used `schema` and `database` interchangeably for the second-level namespace
- newer Spark versions use `catalog` (third-level) and `namespace` (second-level)

I think the right next step is to support `catalog` and `namespace` as official aliases for `database` and `schema`, respectively. That could mean:

- adding them to `_ALIASES`, as dbt-databricks does here
- adding `catalog` and `namespace` classmethods on `SparkRelation` that return `database` and `schema`, respectively

Is that something you'd be interested in contributing?
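To make the two approaches concrete, here is a minimal standalone sketch. This is not the actual dbt-spark implementation (the real `SparkRelation` builds on dbt's `BaseRelation`); the class, its fields, and the `from_dict` helper below are simplified stand-ins used only to illustrate the aliasing idea:

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical, simplified stand-in for dbt-spark's SparkRelation.
@dataclass
class SparkRelation:
    database: Optional[str] = None
    schema: Optional[str] = None
    identifier: Optional[str] = None

    # Approach 1: normalize aliased keys on construction, similar in
    # spirit to the _ALIASES mapping mentioned above.
    _ALIASES = {"catalog": "database", "namespace": "schema"}

    @classmethod
    def from_dict(cls, kwargs: dict) -> "SparkRelation":
        normalized = {cls._ALIASES.get(k, k): v for k, v in kwargs.items()}
        return cls(**normalized)

    # Approach 2: read-only accessors, so relation.catalog and
    # relation.namespace work wherever relation.database and
    # relation.schema do.
    @property
    def catalog(self) -> Optional[str]:
        return self.database

    @property
    def namespace(self) -> Optional[str]:
        return self.schema


rel = SparkRelation.from_dict(
    {"catalog": "main", "namespace": "analytics", "identifier": "events"}
)
print(rel.database, rel.namespace)  # main analytics
```

Either approach keeps `database`/`schema` as the canonical fields, so existing macros keep working while user-facing configs can adopt the Spark-native terms.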
This issue is the root cause of this problem: https://github.com/dbt-labs/spark-utils/issues/38
This code does not work any more:
```
{% for database in spark__list_schemas('not_used') %}
  {% for table in spark__list_relations_without_caching(database[0]) %}
```
The value returned by `spark__list_schemas()` is the result of `SHOW DATABASES`, which contains only a single column named `databaseName`.
This means that `relation.schema` in `spark__list_relations_without_caching` returns an empty string, so

```
show table extended in {{ relation.schema }} like '*'
```

causes a SQL syntax error.
I am not sure why `.schema` was added in commit #972. For my purposes, simply changing `relation.schema` to `relation` fixes the issue, but I do not know what other problems such a change might cause.
It seems that #972 is a breaking change.
Is this a new bug in dbt-spark?
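The failure mode described above can be reproduced without dbt at all. The helper below is a hypothetical stand-in for the macro's string rendering, not dbt's actual code; it just shows what the SQL looks like when the relation's schema is empty versus when the database name is used directly:

```python
# Standalone sketch of the templating problem (not dbt's macro code).
def render_show_tables(schema: str) -> str:
    # Mirrors: show table extended in {{ relation.schema }} like '*'
    return f"show table extended in {schema} like '*'"


# SHOW DATABASES yields rows with a single 'databaseName' column; a
# relation built from such a row ends up with an empty schema.
broken = render_show_tables("")      # name lost -> invalid SQL
fixed = render_show_tables("my_db")  # database name used directly

print(repr(broken))  # "show table extended in  like '*'"
print(repr(fixed))   # "show table extended in my_db like '*'"
```

The `broken` string has nothing between `in` and `like`, which is exactly the syntax error Spark reports.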
Current Behavior
`spark__list_relations_without_caching` expects the legacy field `relation.schema`
Expected Behavior
`spark__list_relations_without_caching` should use the `relation` directly
Steps To Reproduce
N.A.
Relevant log output
No response
Environment
Additional Context
See Spark SQL migration guide