When using partitions, the function get_columns_in_relation returns additional, unwanted rows like : "Part 0", "Part 1", ...
This is a problem when using the merge_update_columns or merge_exclude_columns. The function is called by the spark__get_merge_update_columns macro and adds these irrelevant columns and then getting an error.
Steps To Reproduce
First, create minimal model using that use partitions (called test_partitions) :
{{
config(
materialized='table',
partition_by=[ 'part_col1' ],
file_format='delta',
)
}}
with incoming_data as (
select 1 as id, 'a' as part_col1
union all
select 2 as id, 'a' as part_col1
)
select * from incoming_data
Second, run it using :
dbt run -s test_partitions
Third, create a macro that logs the result of the function :
Core:
- installed: 1.8.3
- latest: 1.8.4 - Update available!
Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- glue: 1.8.1 - Up to date!
- spark: 1.8.0 - Up to date!
The operating system you're using:
MacBook Pro 2,3 GHz Intel Core i9 8 cœurs
The output of python --version:
Python 3.12.4
Ideas to solve the issue
I think we should add a step similar to this one and get rid of this unwanted columns
Describe the bug
When using partitions, the function get_columns_in_relation returns additional, unwanted rows like : "Part 0", "Part 1", ...
This is a problem when using the merge_update_columns or merge_exclude_columns. The function is called by the spark__get_merge_update_columns macro and adds these irrelevant columns and then getting an error.
Steps To Reproduce
First, create minimal model using that use partitions (called test_partitions) :
Second, run it using :
Third, create a macro that logs the result of the function :
Forth, run the macro using :
You should have something similar to the following as an output :
Expected behavior
I should have only "actual" columns as an output. In the case described above, something similar to :
Screenshots and log output
See section above.
System information
The output of
dbt --version
:The operating system you're using: MacBook Pro 2,3 GHz Intel Core i9 8 cœurs The output of
python --version
:Ideas to solve the issue
I think we should add a step similar to this one and get rid of this unwanted columns