dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0
405 stars 227 forks

Fixed the behavior of the incremental schema change ignore option to properly handle the scenario when columns are dropped #980

Open case-k-git opened 9 months ago

case-k-git commented 9 months ago

resolves #

Problem

This fixes the same issue that was solved in dbt-databricks: https://github.com/databricks/dbt-databricks/pull/580

In an incremental run with the ignore setting of on_schema_change, newly added columns are ignored as expected. However, when the SQL model is modified to remove a column, the run fails despite the ignore setting, because the insert queries a column that does not exist in the newly created temp table. According to the dbt documentation, a run should not fail when schema changes are ignored, so this PR corrects the behavior.

For example, in this use case, even if we remove column_2 from the SQL model, the query still attempts to include column_2 because it exists in the current table schema. However, since column_2 does not exist in the temporary table, the query fails.

The intended SQL insert statement looks like this:

insert into table `catalog_demo`.`test17060620400077328677_Incremental_strategies`.`append_delta`
      select  column_1, column_2 from `append_delta__dbt_tmp`
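
To make the failure mode concrete, here is a minimal Python sketch that reproduces it with an in-memory SQLite database and then avoids it by selecting only the columns present in both relations. It illustrates the idea only and is not the dbt-spark macro code: the helper names (shared_columns, build_insert, table_columns) are hypothetical, SQLite stands in for Spark, and naming the target columns explicitly in the insert is an assumption of this sketch.

```python
import sqlite3


def shared_columns(target_cols, tmp_cols):
    """Return the target columns that also exist in the temp relation.

    Target order is preserved and names are compared case-insensitively.
    (Hypothetical helper, not part of dbt-spark.)
    """
    tmp_lower = {c.lower() for c in tmp_cols}
    return [c for c in target_cols if c.lower() in tmp_lower]


def build_insert(target, tmp, columns):
    """Build an INSERT ... SELECT naming the same columns on both sides."""
    col_list = ", ".join(columns)
    return f"insert into {target} ({col_list}) select {col_list} from {tmp}"


def table_columns(conn, table):
    """Read column names from SQLite's table_info pragma (name is field 1)."""
    return [row[1] for row in conn.execute(f"pragma table_info({table})")]


conn = sqlite3.connect(":memory:")
# Existing target table still has column_2.
conn.execute("create table append_delta (column_1 integer, column_2 text)")
# The model no longer selects column_2, so the temp table lacks it.
conn.execute("create table append_delta__dbt_tmp (column_1 integer)")
conn.execute("insert into append_delta__dbt_tmp values (1)")

target_cols = table_columns(conn, "append_delta")
tmp_cols = table_columns(conn, "append_delta__dbt_tmp")

# Naive statement: select every target column from the temp table.
naive_sql = build_insert("append_delta", "append_delta__dbt_tmp", target_cols)
try:
    conn.execute(naive_sql)
    naive_failed = False
except sqlite3.OperationalError:
    # column_2 does not exist in the temp table, so the query is rejected.
    naive_failed = True

# Safer statement: restrict the column list to the intersection.
safe_sql = build_insert(
    "append_delta",
    "append_delta__dbt_tmp",
    shared_columns(target_cols, tmp_cols),
)
conn.execute(safe_sql)
rows = conn.execute("select column_1, column_2 from append_delta").fetchall()
```

In this sketch the dropped column stays in the target table and is filled with NULL for new rows, which matches the documented behavior that a removed column is not removed from the target.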

dbt documentation

Similarly, if you remove a column from your incremental model, and execute a dbt run, this column will not be removed from your target table.

So this failure should not happen: https://docs.getdbt.com/docs/build/incremental-models#default-behavior

Solution

Checklist

cla-bot[bot] commented 9 months ago

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @case-k-git
