Closed jeremyyeo closed 6 months ago
Is this your first time submitting a feature request?
Describe the feature
Unlike other adapters, the dbt-bigquery adapter is special in that some operations are not performed via a SQL statement but by calling a method from the `google-cloud-bigquery` library. For example, dropping an existing relation in this adapter calls the `delete_table` method:
https://github.com/dbt-labs/dbt-bigquery/blob/27ade3df84b040d4140c1870a033b20c2495b5cd/dbt/include/bigquery/macros/relations/drop.sql#L1-L3
https://github.com/dbt-labs/dbt-bigquery/blob/27ade3df84b040d4140c1870a033b20c2495b5cd/dbt/adapters/bigquery/impl.py#L232-L240
Snowflake (Redshift + Postgres as well) will issue straightforward DDL:
https://github.com/dbt-labs/dbt-snowflake/blob/82b2acbf36099ea28a22032e98e95dd888e90bbd/dbt/include/snowflake/macros/relations/drop.sql
https://github.com/dbt-labs/dbt-core/blob/d597b80486ebb409685fafb23d972f5361ddf7fc/core/dbt/include/global_project/macros/relations/table/drop.sql
https://github.com/dbt-labs/dbt-redshift/blob/08625fe99db3bcaac94702719bda42ce168ed0ec/dbt/include/redshift/macros/relations/table/drop.sql
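The difference between the two drop paths can be sketched in Python. This is a minimal illustration, not the adapter's actual code: `render_drop_sql`, `drop_via_api`, and `drop_via_sql` are hypothetical helper names, and `client` is assumed to behave like `google.cloud.bigquery.Client` (only its `delete_table` and `query` methods are used).

```python
import logging

logger = logging.getLogger("drop_sketch")  # hypothetical logger name


def render_drop_sql(project: str, dataset: str, table: str) -> str:
    """Build a BigQuery DDL statement with each name part backtick-quoted."""
    return f"drop table if exists `{project}`.`{dataset}`.`{table}`"


def drop_via_api(client, project: str, dataset: str, table: str) -> None:
    """API path: the table is removed by a library call, so there is
    no SQL text that could show up in a SQL-oriented debug log."""
    # Client.delete_table accepts a "project.dataset.table" string ID.
    client.delete_table(f"{project}.{dataset}.{table}", not_found_ok=True)


def drop_via_sql(client, project: str, dataset: str, table: str) -> str:
    """SQL path: the statement exists as a string, so it can be logged
    before it is sent, the same way any other query would be."""
    sql = render_drop_sql(project, dataset, table)
    logger.debug("executing: %s", sql)
    client.query(sql).result()  # wait for the DDL job to finish
    return sql
```

With the SQL path, the drop passes through a plain string that can be written to the log before execution. With the API path there is no statement to show, only the call itself, which is why it is easy to miss when reading logs for SQL.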
The outcome of this is that debugging gets quite a bit harder. Recently, I was helping a user debug potential duplicates in their seeds - i.e. seeds simply being "inserted into" without first being dropped.
Let's look at the debug logs for a seed that already exists:
^ There's basically no indicator as to whether the existing seed was dropped before it was recreated.
Let's look at the project history from the BQ UI:
There's a `LOAD` operation - which presumably is the insertion of the data into the table - and then the subsequent `QUERY` operation, which is the `alter table ... set OPTIONS()` statement.
It turns out if we go to GCP Logging:
We do see the delete operation there.
I do think this is a bit cumbersome to debug though - when we compare it to doing the same thing in Snowflake.
^ From the one spot (debug logs) - we can see exactly all the DDL involved with the seed.
Describe alternatives you've considered
I suppose a user could rewrite their own:
Who will this benefit?
Folks who want to debug straight from the debug logs. By executing straight SQL (`drop/truncate table ...`), it becomes more obvious which operations dbt is performing (across all adapters - at least the "major" ones) for any type of node, rather than requiring specific knowledge and digging into specialised tools (GCP Logging would perhaps count as one).
Are you interested in contributing this feature?
Sure
Anything else?
This is kind of the second time I've thought about this - the first time was in https://github.com/dbt-labs/dbt-bigquery/issues/886 where I wasn't actually seeing the `drop table ...` statements when looking at a customer's debug log - but that's because those drop operations were not actually invoked via SQL statements.
Top of mind is just this "relation dropping" operation, but there may be others - I have not looked too deeply yet.