Open AlexJoz opened 1 year ago
@AlexJoz Thanks for opening!
I don't think this has to do with SQL models vs. Python models, but rather with the fact that passing in a Manifest in programmatic invocations does not trigger an implicit re-parse:
-- models/my_model.sql
{{ config(
conf_variable = var("conf_variable")
) }}
-- Variable from config: {{ config.conf_variable }}
-- Variable itself: {{ var("conf_variable") }}
select 1 as id
from dbt.cli.main import dbtRunner
from dbt.contracts.graph.manifest import Manifest
# parse once with conf_variable = abc, and keep the resulting manifest
abc_manifest: Manifest = dbtRunner().invoke(["parse", "--vars", "conf_variable: abc"]).result
# reuse that manifest, but pass a different value at compile time
abc_runner = dbtRunner(manifest=abc_manifest)
abc_runner.invoke(["compile", "--vars", "conf_variable: xyz", "--select", "my_model"])
...
08:50:31 Compiled node 'my_model' is:
-- Variable from config: abc
-- Variable itself: xyz
select 1 as id
...
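To make the mismatch in that output easier to see, here is a toy model in plain Python. It is not dbt's implementation: ToyNode, toy_parse and toy_compile are hypothetical names, and the only assumption is the one the output above suggests, namely that the config value is resolved once at parse time and stored, while the var call in the model body is rendered again at compile time.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    config: dict   # resolved once, when the project is parsed
    raw_body: str  # rendered again on every compile


def toy_parse(cli_vars: dict) -> ToyNode:
    # The config stores the resolved value, not the var() call that produced it.
    return ToyNode(
        config={"conf_variable": cli_vars["conf_variable"]},
        raw_body="-- Variable from config: {config}\n-- Variable itself: {var}",
    )


def toy_compile(node: ToyNode, cli_vars: dict) -> str:
    # Only the body sees the vars of the current invocation.
    return node.raw_body.format(
        config=node.config["conf_variable"],
        var=cli_vars["conf_variable"],
    )


node = toy_parse({"conf_variable": "abc"})
compiled = toy_compile(node, {"conf_variable": "xyz"})
print(compiled)
# -- Variable from config: abc
# -- Variable itself: xyz
```

Parsing again with the new vars is the only way to refresh the stored config in this toy, which mirrors the behaviour reported above.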
There are a few ways we could look to handle this automatically, both of them at least a little tricky:
- Detect when the --vars that are passed in differ from the vars stored on the manifest, and re-parse. Basically, construct the ManifestStateCheck and compare it to manifest.ManifestStateCheck (the one that's been passed in), just like we do for partial parsing.
- Store the unrendered var call, instead of the resolved value itself. Then, always reevaluate the var call given the values that have been passed in.

Programmatic invocations are "full control" mode: you're in charge. So if you're going to change the value of --vars being passed in, you always need to trigger a re-parse.
from dbt.cli.main import dbtRunner
from dbt.contracts.graph.manifest import Manifest
abc_vars = "conf_variable: abc"
abc_manifest: Manifest = dbtRunner().invoke(["parse", "--vars", abc_vars]).result
abc_runner = dbtRunner(manifest=abc_manifest)
# still need to supply --vars (not just reused from Manifest; this could be a future enhancement)
abc_runner.invoke(["compile", "--vars", abc_vars, "--select", "my_model"])
# -- Variable from config: abc
# -- Variable itself: abc
xyz_vars = "conf_variable: xyz"
# need to trigger a re-parse because --vars have changed
xyz_manifest: Manifest = dbtRunner().invoke(["parse", "--vars", xyz_vars]).result
xyz_runner = dbtRunner(manifest=xyz_manifest)
xyz_runner.invoke(["compile", "--vars", xyz_vars, "--select", "my_model"])
# -- Variable from config: xyz
# -- Variable itself: xyz
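If the same process compiles with several different sets of vars, the re-parse step can be wrapped in a small helper. This is a user-side sketch, not something dbt provides. VarsAwareRunner and its runner_factory parameter are hypothetical names. It assumes that the vars can be passed as a JSON string (as in the CLI example in this issue) and that the object returned by invoke exposes success, result and exception. It tracks only --vars, so changes to project files, target or profile would still need a fresh parse.

```python
import json
from typing import Any, Callable, Dict, Optional


class VarsAwareRunner:
    """Hypothetical helper: keeps one parsed manifest per distinct set of vars."""

    def __init__(self, runner_factory: Optional[Callable[..., Any]] = None):
        if runner_factory is None:
            # Imported lazily so the helper can be exercised without dbt installed.
            from dbt.cli.main import dbtRunner
            runner_factory = dbtRunner
        self._runner_factory = runner_factory
        self._manifests: Dict[str, Any] = {}

    def compile(self, cli_vars: dict, select: str) -> Any:
        # Serialize with sorted keys so equal dicts map to the same cache entry.
        vars_arg = json.dumps(cli_vars, sort_keys=True)
        if vars_arg not in self._manifests:
            # The vars changed (or were never seen): trigger a re-parse.
            parsed = self._runner_factory().invoke(["parse", "--vars", vars_arg])
            if not parsed.success:
                raise RuntimeError(f"dbt parse failed: {parsed.exception}")
            self._manifests[vars_arg] = parsed.result
        # Still pass --vars explicitly; they are not reused from the manifest.
        runner = self._runner_factory(manifest=self._manifests[vars_arg])
        return runner.invoke(["compile", "--vars", vars_arg, "--select", select])
```

With dbt installed, VarsAwareRunner().compile({"conf_variable": "xyz"}, "my_model") would parse once for that value and reuse the manifest on later calls with the same vars.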
@jtcohen6
Programmatic invocations are "full control" mode—you're in charge.
Seems I haven't dived deep enough yet :)
Thanks for the quick feedback! I can confirm that it works with the proposed way of doing this.
Is this a new bug in dbt-core?
Current Behavior
Programmatic invocation overrides variables only in SQL models, while Python models continue to read values from the configuration. In contrast, a pure CLI run works as expected by overriding all the values in both models.
Expected Behavior
Both python and sql models use overridden (--vars) values when invoked by dbtRunner.
Steps To Reproduce
sql model:
select {{ var("conf_variable") }}
python model:
both models config:
cli invocation:
dbt run --vars '{"key": "value", "conf_variable": "1212"}'
programmatic:
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
other (mention it in "Additional Context")
Additional Context
duckdb