starburstdata / dbt-trino

The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com)
Apache License 2.0
212 stars 55 forks

Change full-refresh behaviour to drop target table after successful run #327

Open lutzkuen opened 1 year ago

lutzkuen commented 1 year ago

Describe the feature

Right now, when you run a full refresh with dbt-trino, it drops the target table and then attempts to re-create it. If users try to access the data while it is refreshing, they get an error. We have also adopted the practice of manually taking backups before a full refresh of large tables, so that we can roll back if the refresh fails. My proposal is to create the new table with the __dbt_tmp suffix and, only after it has been created successfully, drop the target table and rename the __dbt_tmp table to the target name. This way the data stays available to users, and no manual rollback is needed if the full refresh does not succeed.
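To make the proposed ordering concrete, here is a minimal sketch in Python that only builds the SQL statements and runs them in sequence. It is not how dbt-trino is implemented: the function names `full_refresh_statements` and `run_statements`, and the `execute` callback, are hypothetical and stand in for whatever the adapter would actually use. It assumes the connector supports `CREATE TABLE ... AS`, `DROP TABLE IF EXISTS`, and `ALTER TABLE ... RENAME TO`.

```python
from typing import Callable, List


def full_refresh_statements(
    target: str, select_sql: str, tmp_suffix: str = "__dbt_tmp"
) -> List[str]:
    """Build the statements for a 'build first, swap afterwards' full refresh.

    Hypothetical helper for illustration; `target` is a (possibly qualified)
    table name and `select_sql` is the model's compiled SELECT.
    """
    tmp = f"{target}{tmp_suffix}"
    return [
        # Clean up a leftover temp table from an earlier failed run.
        f"DROP TABLE IF EXISTS {tmp}",
        # Build the new data next to the existing table.
        f"CREATE TABLE {tmp} AS {select_sql}",
        # Only reached if the CREATE above succeeded.
        f"DROP TABLE IF EXISTS {target}",
        f"ALTER TABLE {tmp} RENAME TO {target}",
    ]


def run_statements(execute: Callable[[str], None], statements: List[str]) -> None:
    """Run statements in order; an exception from `execute` stops the sequence."""
    for sql in statements:
        execute(sql)
```

If the `CREATE TABLE` step raises, the later statements never run, so the existing target table is left untouched. Note that this sketch still has a short window between the final `DROP` and the `RENAME` in which the target does not exist; it shrinks the unavailability window rather than removing it.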

Describe alternatives you've considered

The manual workaround is easy but annoying.

Who will benefit?

Users and developers will benefit, as there is less room for error and better data availability.

Are you willing to submit PR?

hovaesco commented 1 year ago

Hi @lutzkuen, which target connector are you using? We are working on adding support for CREATE OR REPLACE in the engine, which could help with this issue and could replace the need to create temp tables.

lutzkuen commented 1 year ago

We are using dbt-trino with both Starburst and Trino on Iceberg. Once CREATE OR REPLACE is supported by the engine, that would make for a much simpler solution.
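For comparison, the CREATE OR REPLACE route would reduce the full refresh to a single statement. The sketch below assumes the engine and the target connector accept `CREATE OR REPLACE TABLE ... AS`, which was still in progress at the time of this thread; the helper name `create_or_replace_statement` is hypothetical and not part of dbt-trino.

```python
def create_or_replace_statement(target: str, select_sql: str) -> str:
    """Build a single-statement full refresh.

    Assumption: the engine and connector support CREATE OR REPLACE TABLE ... AS.
    """
    return f"CREATE OR REPLACE TABLE {target} AS {select_sql}"


print(create_or_replace_statement("analytics.orders", "SELECT 1 AS id"))
```

With this form there is no temp table to clean up and no separate drop and rename steps for the adapter to sequence.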