Open jelstongreen opened 3 months ago
I would have time to work on this.
It looks like currently the merge statement that is executed will always try to cast the datatypes. Even if it fails to cast the data (for example string
to int
) it will insert null
and there will be no error from the databricks side.
I would propose to check for schema changes in dbt, by using the check_for_schema_changes
macro from dbt core. It is already being used to check for schema changes in incremental models.
@benc-db should this check just happen, or would it make sense to make it configurable? Similar to incremental models there could be an on_schema_change
config option. It could be "ignore" by default to keep the current behavior, and only if it is set to "fail" check_for_schema_changes
would be executed.
@mikealfare Would love to hear your thoughts on @henlue's comment.
Without thinking about it too much, adding on_schema_change
to snapshots sounds like a good idea. We're in the process of updating snapshots to add more features in general. I would post some comments on this issue. It collects all of the snapshot work being done. I don't know if this particular feature is included, but there are certainly schema-related items on there.
Describe the bug
We noticed one table that we snapshot that when the source data changes a col from int to bigint but the downstream snapshot is not amended that the snapshot will just continuously insert the old record whenever the snapshot is run and not error that the new value cannot be inserted.
Steps To Reproduce
Snapshot a source model with a regular int column. Change the source model column to bigint and change the value in the row to a bigint value larger than max int. Run the dbt snapshot on the source table. The snapshot should have the check strategy and have the column that's been amended in that check list.
Expected behavior
A clear and concise description of what you expected to happen. If the target dtype is int for a bigint value then there should be an error rather than a silent failure.
Screenshots and log output
If applicable, add screenshots or log output to help explain your problem.
System information
The output of
dbt --version
:The operating system you're using:
The output of
python --version
:Additional context
Add any other context about the problem here.