dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

[Bug] The tblproperties are not applied when using Python Model to create a table #982

Closed csimplestring closed 7 months ago

csimplestring commented 7 months ago

Is this a new bug in dbt-spark?

Current Behavior

When a Python model is used, the tblproperties set in the config are not applied. The resulting table does not contain those table properties.

Expected Behavior

The table properties should be applied when the table is first created by the Python model.

Steps To Reproduce

In models/data.yml, I set the following configuration:

config:
    partition_by:
    - event_date
    tblproperties:
        delta.compatibility.symlinkFormatManifest.enabled: false
        delta.minReaderVersion: 2
        delta.minWriterVersion: 7
        delta.columnMapping.mode: name
        delta.enableIcebergCompatV2: true
        delta.universalFormat.enabledFormats: iceberg

The model in models/data.py is:

from pyspark.sql import functions as func

def model(dbt, session):

    df = dbt.source("data", "accounts")
    # do some transformation
    return df

Then I run dbt run --select data.
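Because the run reports no error, the missing properties only show up when the table itself is inspected. The following is a minimal sketch of such a check, assuming the rows come from Spark SQL's SHOW TBLPROPERTIES (which returns key and value columns). The function name missing_tblproperties and the table name in the comment are hypothetical and not part of dbt or dbt-spark.

```python
# Expected properties, copied from the tblproperties config in models/data.yml.
EXPECTED = {
    "delta.compatibility.symlinkFormatManifest.enabled": False,
    "delta.minReaderVersion": 2,
    "delta.minWriterVersion": 7,
    "delta.columnMapping.mode": "name",
    "delta.enableIcebergCompatV2": True,
    "delta.universalFormat.enabledFormats": "iceberg",
}


def _as_text(value):
    """Render a config value as a lowercase-boolean or plain string."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def missing_tblproperties(expected, actual_rows):
    """Return the expected properties that are absent or differ on the table.

    actual_rows is an iterable of (key, value) pairs, for example:
        rows = [(r["key"], r["value"]) for r in
                session.sql("SHOW TBLPROPERTIES my_catalog.my_schema.data").collect()]
    (my_catalog.my_schema.data is a hypothetical table name.)
    """
    actual = {str(k): str(v) for k, v in actual_rows}
    return {
        key: _as_text(value)
        for key, value in expected.items()
        if actual.get(key) != _as_text(value)
    }
```

An empty result would mean every configured property is present on the table with the expected value. In the situation described in this issue, the result would list the properties that were not applied.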

Relevant log output

No errors in the log output.

Environment

- OS: Linux
- Python: 3.10.7
- dbt-core: 1.7.7
- dbt-spark: 1.7.1

Additional Context

I am using Azure Databricks with Unity Catalog on DBR 14.3 LTS. I use dbt-databricks directly, but I believe it is just a thin wrapper on top of this project, so I am asking for help here.
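One possible stopgap, which nobody in this thread has confirmed, is to set the properties after the model has run with an ALTER TABLE ... SET TBLPROPERTIES statement. The sketch below only builds that statement as a string. The helper name alter_tblproperties_sql and the table name in the usage note are hypothetical, and whether every Delta property listed above can be set after table creation has not been verified here.

```python
def _as_text(value):
    """Render a config value as a lowercase-boolean or plain string."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def alter_tblproperties_sql(relation, properties):
    """Build an ALTER TABLE ... SET TBLPROPERTIES statement.

    relation is the fully qualified table name; properties maps
    property names to values. Quotes are rejected rather than escaped
    to keep this sketch simple.
    """
    if not properties:
        raise ValueError("at least one property is required")
    pairs = []
    for key, value in properties.items():
        text = _as_text(value)
        if "'" in key or "'" in text:
            raise ValueError(f"single quotes are not supported: {key!r}")
        pairs.append(f"'{key}' = '{text}'")
    return f"ALTER TABLE {relation} SET TBLPROPERTIES ({', '.join(pairs)})"
```

For example, alter_tblproperties_sql("my_catalog.my_schema.data", {"delta.minReaderVersion": 2}) produces a statement that could then be run through a Spark session or any SQL client connected to the workspace.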

Fleid commented 7 months ago

I'm closing this here, as the ticket should be opened in the dbt-databricks repo (https://github.com/databricks/dbt-databricks). The team there will be able to get to this faster than we will.