dbt-labs / dbt-external-tables

dbt macros to stage external sources
https://hub.getdbt.com/dbt-labs/dbt_external_tables/latest/
Apache License 2.0
314 stars 123 forks source link

Handle BigQuery non-string option 'max_staleness' #237

Closed marcbllv closed 8 months ago

marcbllv commented 1 year ago

Description & motivation

resolves: #231

Option item max_staleness in BigQuery must be passed as an INTERVAL, not a string. But it's flagged as a string in YAML config.

Therefore compiled SQL code is:

create or replace external table `... my_table_name`
options (
    max_staleness = 'INTERVAL 1 HOUR'  -- <-- Quotes makes BigQuery angry here
)

while it should be:

create or replace external table `... my_table_name`
options (
    max_staleness = INTERVAL 1 HOUR  -- <-- No quote, BigQuery's happy
)

Checklist

cbini commented 1 year ago

just ran into this myself! hoping this can get merged into a release soon

marcbllv commented 1 year ago

Thank you @cbini! Do you know when it could be merged and a new version released?

cbini commented 1 year ago

@marcbllv nope! I'm just another user who's having the same problem. Wanted to bump this to call attention to it.

thomas-vl commented 9 months ago

@marcbllv I think a cleaner solution is just to remove the check if its a string and do the quotes, if you need to pass a string into the options do so explicitly by setting the quotes in the YAML configuration. This solution adds more complexity instead of reducing it.

marcbllv commented 9 months ago

@thomas-vl thanks for your answer! I agree on the added complexity of the code, but IMO it's fine since it decreases complexity on YAML configuration for users: you can't write INTERVAL types in yaml, so let's have users simply write strings and dbt-external-tables does the conversion (by adding quotes or not depending on the type on BigQuery) -> no need to worry about quoting or not in yaml.

Also removing both the check and quotes will break backwards compatibility: every user will have to explicitly write quotes in yaml (and it's super painful since you need to escape quotes).

IMO the added complexity is ok: it's not that huge, and allow users to do something that can't be done otherwise! WDYT?

cbini commented 9 months ago

I think another thing you could do is add max_staleness to the if in the for loop like uris.

edit: nope! that'll just skip it entirely

thomas-vl commented 9 months ago

@marcbllv Ok makes sense, we have an open discussion anyway to change the way how to set BigQuery options.

thomas-vl commented 9 months ago

@jeremyyeo much needed feature, lets merge it 🥇

thomas-vl commented 8 months ago

@dataders I saw you merge another pr, could you also take a look at this one?