dbt-labs / dbt-bigquery

dbt-bigquery contains all of the code required to make dbt operate on a BigQuery database.
https://github.com/dbt-labs/dbt-bigquery
Apache License 2.0

use "direct" write for non-partitioned python model materializations #1388

Closed colin-rogers-dbt closed 3 weeks ago

colin-rogers-dbt commented 3 weeks ago

resolves #1318

To support partitioned materializations in BigQuery Python models, we switched from "direct" to "indirect" mode when writing model results in Dataproc back to BigQuery. As the name implies, "indirect" temporarily stages data in the provided GCS bucket. If a user has a retention policy on the bucket, the write fails because the bucket won't allow Dataproc to delete these temp files as it goes.

This PR sidesteps that issue and preserves backwards compatibility with models created before 1.7 by using "direct" write when no partition config is provided.
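
For reviewers, here is a rough sketch of the selection logic described above. It is not the adapter's actual code: `choose_write_method`, `build_write_options`, and the bucket name are hypothetical names made up for illustration. The only things taken from the Spark BigQuery connector are the `writeMethod` and `temporaryGcsBucket` option names.

```python
from typing import Optional


def choose_write_method(partition_by: Optional[dict]) -> str:
    """Pick the connector write method for a Python model (hypothetical helper)."""
    # Partitioned models keep "indirect", which stages data in GCS.
    # Everything else uses "direct", which never touches the bucket.
    return "indirect" if partition_by else "direct"


def build_write_options(partition_by: Optional[dict], gcs_bucket: str) -> dict:
    """Build the option dict passed to the Spark writer (hypothetical helper)."""
    options = {"writeMethod": choose_write_method(partition_by)}
    if options["writeMethod"] == "indirect":
        # Only the indirect path needs a staging bucket.
        options["temporaryGcsBucket"] = gcs_bucket
    return options


if __name__ == "__main__":
    # Non-partitioned model: no staging bucket involved.
    print(build_write_options(None, "my-staging-bucket"))
    # Partitioned model: still staged through GCS.
    print(build_write_options({"field": "created_at"}, "my-staging-bucket"))
```

In a Dataproc job the resulting dict would be handed to the writer with something like `df.write.format("bigquery").options(**opts).save(table)`. A bucket with a retention policy only matters on the "indirect" path, since that is the only one that creates and deletes temp files.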

Note: I have not added any new testing to cover the case where a user has set a retention policy. Ultimately I think this is an edge case we don't need to test against, but as a follow-up we should document that a bucket retention policy cannot be used with a partitioned Python model.

docs dbt-labs/docs.getdbt.com/#
