databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0
213 stars 114 forks source link

DBT failing to execute CREATE TABLE command because it assigned a column DEFAULT value #758

Open shloktech opened 1 month ago

shloktech commented 1 month ago

Created from: https://github.com/dbt-labs/dbt-spark/issues/1085

Current Behavior

I am using DBT for Data Testing. Over here I have a few not_null checks on delta tables which are in databricks and I am running them by using the below command: dbt build --profile {our_data_testing_profile} -t {our_env} --select "models/models_to_test+" --store-failures| tee dbt_issues.log || true

Here I am storing the output of data testing into delta tables in databricks which are under dbt_test__audit schema. While doing it we are getting the below error stating:

Runtime Error in test our_internal_test_name WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but the corresponding table feature was not enabled. Please retry the command again after executing ALTER TABLE tableName SET
 TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

On debugging I found that it dbt is using the below query to create table in dbt_test__audit schema from dbt.log file: image

This is occurring because databricks does not let us create a table which is based on an empty dataframe. Output of the select query: image

To resolve in query we need to add the below property in the above query which is suggested by databricks as well in the error message. After this the query works as per expectations: image

Even after creating the above table when I rerun the dbt test then it fails because it uses the query in first screenshot above which does not have the TBLPROPERTIES command in it. I searched in dbt documentations but did not find any option that let's me add table properties. I I run only the dbt build query then the data testing works without any error but when I run it with commands to store the results then it fails due to above TBLPROPERTIES not present in the create table query generated by dbt-core.

Expected Behavior

The expected behavior can be divided in two parts: 1) DBT should add this table property by default in the create table query for databricks as a source. 2) There should be an option to configure table properties when we are storing the failures.

Steps To Reproduce

  1. Configure a not null check on a delta table in databricks which will pass.

  2. Run the below command: dbt build --profile {your_data_testing_profile} -t {your_environment} --select "models/models_to_test+" --store-failures| tee dbt_issues.log || true

  3. Make sure that the target for storing the results is databricks. Secondly the source table which is undergoing data test has a column which has default value assigned to it like below: image

  4. You will get the below error for your test case:

    Runtime Error in test our_internal_test_name WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but the corresponding table feature was not enabled. Please retry the command again after executing ALTER TABLE tableName SET
    TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

Relevant log output

No response

Environment

- OS: Ubuntu 22.04.4 LTS
- Python: 3.8.18
- dbt-core:1.8.4

Which database adapter are you using with dbt?

other (mention it in "Additional Context")

Additional Context

The adapter used is Databricks

benc-db commented 1 month ago

https://docs.getdbt.com/reference/resource-configs/databricks-configs#configuring-tables You can specify tblproperties in your model config