Closed epapineau closed 5 days ago
Heya @epapineau 👋
This looks the same to me as needing to force dependencies like this:
-- depends_on: {{ ref('some_model') }}
{{ generate_model_sql() }}
If you don't do the -- depends_on: {{ ref('some_model') }}
bit, does your project display the expected lineage graph and run in the correct order?
@dbeatty10 thanks for the suggestion! The issue is not lineage or ordering - that is working as expected. However, after adding the suggested --depends_on
and executing dbt parse
, the ref'd models are still not present in the depends_on
object for the macro in manifest.json
.
After touching base with @gshank, I can confirm that it's known behavior that we don't track model ref
s when they are in a macro. So I'm going to update this to be a feature request.
Correct!
Conceptually, this is a bit similar to "vars referenced in macros are resolved based on the scope (project) of the model calling the macro, rather than the scope (package) where the macro is defined":
@jtcohen6 interesting. Is there an alternative recommendation to understand macro dependencies on models from manifest.json
?
@epapineau Thinking aloud - could you add an ephemeral model which calls the macro, and then look at the ephemeral model's depends_on.nodes
in manifest.json
?
That use case would work for one or two macros, but would quickly become unfeasible at the scale of 100+ macros 🤔
Fair! Okay, taking a step back — why not track model dependencies within macros?
Conceptually, macros are resources in your project, but they aren't actually nodes in the DAG. They're functions that return code, which can call (and depend on) other functions that return code — whereas depends_on.nodes
describes actual edges within the DAG. A model is only directly depended on by another DAG node, not by the function called within that DAG node which returns code.
You're allowed to define a macro that references a model that doesn't exist:
{%- macro generate_bad_model_sql() -%}
select
column_name_a,
column_name_b
from {{ ref('does_not_exist') }}
{%- endmacro -%}
Until you call that macro within an actual model (DAG node), it doesn't pose a problem for constructing a viable DAG. As soon as you do, you'll see a Compilation Error
.
It's also tricky to think about: How would we track model → macro dependencies when that macro takes an argument, to modify the reference based on some input or parse-time condition?
{%- macro generate_model_sql(model_name='some_model') -%}
select
column_name_a,
column_name_b
from {{ ref(model_name) }}
{%- endmacro -%}
I do see the value in tracking which macros depend on ref
within their depends_on.macros
. I don't think we support that for builtin ref right now (it's not a macro per se), but it does work for custom ref overrides. Not actually recommending this, just to illustrate how you might be able to programmatically determine which macros / how many macros directly reference ref
:
{%- macro generate_model_sql() -%}
select
column_name_a,
column_name_b
from {{ my_project.ref('some_model') }}
{%- endmacro -%}
{%- macro custom_ref() -%}
{{ return(builtins.ref(*varargs, **kwargs)) }}
{%- endmacro -%}
Then in manifest.json:
"macro.my_project.generate_model_sql": {
"name": "generate_model_sql",
"resource_type": "macro",
...
"depends_on": {
"macros": [
"macro.my_project.ref"
]
},
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Is this a new bug in dbt-core?
Current Behavior
I am not 100% sure this is a 🐛, but it doesn't meet my assumptions of what I'd find in
manifest.json
:If I have a macro, defined as:
I would expect
macros.macro.project.generate_model_sql.depends_on
to contain:{'node': ['model.project.generate_model_sql']}
. However, AFAICT, thisdepends_on
only contains metadata about referenced macros and not referenced nodes / models.Expected Behavior
Ref'd models to be included in the
depends_on
objectSteps To Reproduce
ref
dbt parse
depends_on
value inmanifest.json
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
bigquery
Additional Context
No response