dbt-labs / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
https://dbt-athena.github.io
Apache License 2.0
228 stars 100 forks source link

When using `force_batch=true` with incremental models, it will fail if there is no data to write #519

Open parsable-alex-antonison opened 1 year ago

parsable-alex-antonison commented 1 year ago

dbt version: 1.7.0 use case: I am processing CDC data from a database into Athena Iceberg tables. issue: While I can successfully run the first load of the query, when I go to run an incremental model without any data, it will fail when it is attempting to query a table that does not exist.

  1. Attempt to create a tmp table
create table "awsdatacatalog"."databasename"."tablename__dbt_tmp__tmp_not_partitioned"
.... (query that is looking for new records in the source table from the target table. it will return nothing because no additional data has been added to the source table)
  1. Attempt to query partitions from the temp table
    select distinct partition_1, partition_2 from "awsdatacatalog"."databasename"."tablename__dbt_tmp__tmp_not_partitioned" 
    order by team_partition, team_id
  2. And this last step is where it fails
    select distinct partition_1, partition_2 from ("awsdatacatalog"."databasename"."tablename__dbt_tmp")
    order by team_partition, team_id
nicor88 commented 1 year ago

Adding a bit more context. The issue occur also for simple models like:

{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='user_id',
    table_type='iceberg',
   force_batch='true'
) }}

select 1 as user_id, 'a' as user_name

The issue seems to be in get_partition_batches and create_table_as_with_partitions

Here are some possible solutions:

@svdimchenko FYI.

antonysouthworth-halter commented 6 months ago

Oh, I didn't see this when I searched before opening #656 , will close in favour of this one.

antonysouthworth-halter commented 6 months ago

I have a fix in progress on #658