We are using merge incremental strategy in dbt and have tried the on_schema_change = "sync_all_columns" in order to handle any additions or deletions of columns.
However, due to an exception created in spark__alter_relation_add_remove_columns macro in dbt-spark, it simply throws this error whenever a column deletion is detected - Delta lake does not support dropping columns from tables although delta lake supports column deletions provided the table has required table properties.
In order to work around this problem, we have created an override macro that does the following,
Checks if the required tblproperties are set. If not, it sets them.
Creates an alter query to drop the column(s) and runs the query of column deletion is detected.
Creating this bug request to handle this error and perhaps create an override macro in dbt-databricks incrementals.
Steps To Reproduce
Create a dbt incremental model with at least 2 columns. And set the dbt incremental config - on_schema_change = "sync_all_columns"
{{
config(
materialized="incremental",
unique_key="id",
on_schema_change="sync_all_columns",
)
}}
with
sample_data as (
select 1 as id, 'name1' as name
union
select 2 as id, 'name2' as name
)
select *
from sample_data
Run the dbt model to materialize it in Databricks.
Drop a column and run the model again.
{{
config(
materialized="incremental",
unique_key="id",
on_schema_change="sync_all_columns",
)
}}
with
sample_data as (
select 1 as id
union
select 2 as id
)
select *
from sample_data
Expected behavior
When a column is deleted in an incremental model, we expect that the column is dropped in the target incremental model.
Implement a macro in dbt-databricks incrementals that overrides the default behaviour of spark__alter_relation_add_remove_columns in dbt-spark repo.
Remove the exception for Delta Lake (Line number 406 in the above repo)
Set required tblproperties. (delta.minReaderVersion: 2, delta.minWriterVersion: 5 and delta.columnMapping.mode: name)
Add an alter statement for removing columns
Screenshots and log output
Error when trying to drop a column.
Compilation Error in model test_on_schema_change (models/stage/br/test_on_schema_change.sql)
Delta Lake does not support dropping columns from tables
in macro spark__alter_relation_add_remove_columns (macros/adapters.sql)
called by macro alter_relation_add_remove_columns (macros/adapters/columns.sql)
called by macro sync_column_schemas (macros/materializations/models/incremental/on_schema_change.sql)
called by macro process_schema_changes (macros/materializations/models/incremental/on_schema_change.sql)
called by macro materialization_incremental_databricks (macros/materializations/incremental/incremental.sql)
called by model test_on_schema_change (models/stage/br/test_on_schema_change.sql)
System information
Core:
installed: 1.8.5
latest: 1.8.5 - Up to date!
Plugins:
databricks: 1.8.5 - Up to date!
spark: 1.8.0 - Up to date!
Additional context
I found a PR to fix this in dbt-spark. Not sure what the lead time is to get this updated in dbt-spark repo.
Can you submit a PR for your macro override? I was unaware this limitation has been lifted. I wonder which Databricks runtime version it's compatible with.
Describe the bug
We are using merge incremental strategy in dbt and have tried the on_schema_change = "sync_all_columns" in order to handle any additions or deletions of columns. However, due to an exception created in spark__alter_relation_add_remove_columns macro in dbt-spark, it simply throws this error whenever a column deletion is detected -
Delta lake does not support dropping columns from tables
although delta lake supports column deletions provided the table has required table properties. In order to work around this problem, we have created an override macro that does the following,Creating this bug request to handle this error and perhaps create an override macro in dbt-databricks incrementals.
Steps To Reproduce
Expected behavior
When a column is deleted in an incremental model, we expect that the column is dropped in the target incremental model. Implement a macro in dbt-databricks incrementals that overrides the default behaviour of spark__alter_relation_add_remove_columns in dbt-spark repo.
Screenshots and log output
Error when trying to drop a column. Compilation Error in model test_on_schema_change (models/stage/br/test_on_schema_change.sql) Delta Lake does not support dropping columns from tables
System information
Core:
Plugins:
Additional context
I found a PR to fix this in dbt-spark. Not sure what the lead time is to get this updated in dbt-spark repo.