Open tuan-seek opened 2 years ago
I've also opened a different issue (https://github.com/Tomme/dbt-athena/issues/62). This seems to do exactly that.
Who can review and merge it please?
We'd require this for a performance boost on our queries. Can it be merged?
I've tested this on my own fork with 12 parallel executions (12 batches in parallel for the same hour, each covering a distinct set of minutes from that hour of data), and I confirm it works. Note that if you run dbt in parallel on the same model using different "vars" (like the batch number), then at the initial table creation you'll get 12 CTAS queries instead of 1 CTAS + 11 ITAS (insert-into-as-select) queries, but that can be worked around.
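For anyone wanting to reproduce this kind of test, here is a rough sketch of how the 12 parallel runs could be launched from Python. It is only an illustration under assumptions: the var names `batch` and `minutes`, the round-robin split of minutes, and the helper names are all hypothetical and not taken from the PR. The only dbt features it relies on are the standard `dbt run --select` and `--vars` options.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def minute_batches(n_batches: int = 12) -> list[list[int]]:
    """Split minutes 0..59 into n_batches disjoint sets (round-robin)."""
    return [list(range(b, 60, n_batches)) for b in range(n_batches)]


def build_command(model: str, batch: int, minutes: list[int]) -> list[str]:
    """Build one `dbt run` invocation for a single batch.

    "batch" and "minutes" are hypothetical var names; the model would
    have to read them via var() to filter its input.
    """
    dbt_vars = json.dumps({"batch": batch, "minutes": minutes})
    return ["dbt", "run", "--select", model, "--vars", dbt_vars]


def run_batches(model: str, n_batches: int = 12) -> list[int]:
    """Run all batches concurrently and return each process's exit code."""
    commands = [
        build_command(model, batch, minutes)
        for batch, minutes in enumerate(minute_batches(n_batches))
    ]
    with ThreadPoolExecutor(max_workers=n_batches) as pool:
        # Each worker thread blocks on its own dbt subprocess.
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Calling `run_batches("my_incremental_model")` (a placeholder model name) would start 12 dbt processes against the same model, which is the situation where the first run produces 12 CTAS queries as described above.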
It would be lovely to get this merged into the main trunk. This feature enables parallel queries on Athena and gets us down from 20m/hour to 4m/hour by running distinct sets of batches on the same partition (hourly in our case).
@tuan-seek and @Antauri, I'm quite interested in this feature. In case you are not aware, the community decided to fork Tomme/dbt-athena to have a more community-friendly setup for changes. The new fork is here: https://github.com/dbt-athena/dbt-athena, and it is available on pip too.
That said, could you tell me how it is possible in your setup to have tmp tables with the same name?
Changes in this PR:
- `dbt run` concurrently for incremental model.
- `fail_fast` mode when schema changes