Tomme / dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)
Apache License 2.0
140 stars 79 forks source link

CTAS with unique temporary table name #62

Open Antauri opened 2 years ago

Antauri commented 2 years ago

We're using dbt_athena in running 3 to 6 batches per hour (e.g. 6 parallel queries for the last hour, because the data is so big Athena has a few limitations/problems with it). The problem in running N parallel queries in dbt/dbt_athena is the "tmp" table used in the CTAS query with the incremental_overwrite mode.

Any ideas on how to make the "tmp" relation unique per query with a single model? Workarounds are to define "per-client" models (a table for each client) but that gets out of hand quickly.

I lost my way in the Jinja, else would've proposed a PR.

aut0clave commented 2 years ago

I thought that temp table names were already unique because they include the Athena execution ID. Am I wrong?

Antauri commented 2 years ago

If you're insert_overwrite into a target table, the temporary table crated is named [target_table]__tmp which in case of concurrent runs of dbt or dbt in dagster means that tmp table gets clobbered.