dbt-labs / dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
https://getdbt.com
Apache License 2.0

[ADAP-1019] [Bug] Table already exists, you need to drop it first in incremental models #940

Open lsabreu96 opened 10 months ago

lsabreu96 commented 10 months ago

Is this a new bug in dbt-spark?

Current Behavior

Whenever running an incremental model after the first run, dbt-spark with `hudi` as the file_format reports that the table already exists and needs to be dropped first.

Expected Behavior

Any run after the first should proceed just as the first one did.

Steps To Reproduce

  1. Use dbt-spark 1.5.2
  2. Start a Kyuubi server with Hudi enabled
  3. Run the sample model below twice (a sketch of what each run is expected to execute follows the model)

```sql
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='prim_key',
        file_format='hudi',
        location_root='<s3-path>'
    )
}}

select 1 as prim_key
```

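For reference, this is roughly what the two runs are expected to execute against Spark, written out by hand from the config above and the table name in the error log below; it is a sketch, not the exact SQL the adapter's incremental materialization generates.

```sql
-- run 1 (or --full-refresh): dbt creates the table from scratch
-- (location shown here assumes location_root + model name; placeholder path)
create table teste_dbt_dw_spark.kyuubi_incremental_hudi
using hudi
location '<s3-path>/kyuubi_incremental_hudi'
as select 1 as prim_key;

-- run 2 onwards with incremental_strategy='merge': new rows are staged in a
-- temporary view and merged into the existing table on the unique_key
create temporary view kyuubi_incremental_hudi__dbt_tmp as
select 1 as prim_key;

merge into teste_dbt_dw_spark.kyuubi_incremental_hudi as tgt
using kyuubi_incremental_hudi__dbt_tmp as src
on src.prim_key = tgt.prim_key
when matched then update set *
when not matched then insert *;
```

The reported error suggests the second run took the run-1 path again and re-issued the CREATE TABLE, which Spark rejects because the table already exists.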

Relevant log output

```shell
org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Error operating ExecuteStatement: org.apache.spark.sql.AnalysisException: Table teste_dbt_dw_spark.kyuubi_incremental_hudi already exists. You need to drop it first.
```

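Not part of the original report, but one way to narrow this down is to check, from the same Kyuubi/Spark session dbt connects to, whether the table is actually visible and its metadata readable (table and schema names taken from the error message above):

```sql
-- is the table visible to the session at all?
show tables in teste_dbt_dw_spark like 'kyuubi_incremental_hudi';

-- can its metadata be read? catalog/extension problems with Hudi or Iceberg
-- tables tend to surface here
describe table extended teste_dbt_dw_spark.kyuubi_incremental_hudi;
```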
Environment

- OS: Ubuntu 20.04
- Python: 3.8.10
- dbt-core: 1.5.8
- dbt-spark: 1.5.2

Additional Context

I'm running Kyuubi because I wasn't able to get the thrift connection method working on EMR as described in the docs.

I also followed some examples here, but didn't manage to get it working.

lsabreu96 commented 10 months ago

It seems to be a problem when the adapter can't read all of the tables in the catalog. I had some Iceberg tables in the same catalog, and errors about those kept popping up.

After deleting the Iceberg tables, the adapter worked as expected.
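For context on why unrelated Iceberg tables can matter here: dbt decides between the "create" and "merge" paths based on whether it finds the target relation when listing the schema, and in the 1.x adapter that listing is done with a single wildcard statement roughly like the one below (a sketch; the exact macro output may differ by version). If that statement fails on a table the session can't read, dbt can conclude the Hudi target does not exist and re-issue the initial CREATE TABLE, producing the error above.

```sql
-- roughly what dbt-spark runs to discover existing relations in the schema;
-- one unreadable table (e.g. Iceberg without the right extensions loaded)
-- can fail the whole listing
show table extended in teste_dbt_dw_spark like '*';
```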

github-actions[bot] commented 1 month ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.