dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[Bug] Calling a macro in a pre- or post-hook #7128

Closed seub closed 2 weeks ago

seub commented 1 year ago

Current Behavior

There are three ways to define a pre-hook for a model:

  1. In dbt_project.yml
  2. In a property file (properties.yml)
  3. In a config block (my_model.sql)

Calling a macro in a pre-hook works for 1. and 3. but not 2:

# properties.yml
version: 2
models:
  - name: my_model
    config:
      pre-hook: "{{ my_macro() }}"

produces a compilation error

Could not render {{ my_macro() }}: 'my_macro' is undefined

Expected Behavior

The behavior should be consistent across 1., 2., and 3. Ideally, calling a macro in a pre-hook should work in property files too.

Steps To Reproduce

See Current Behavior

Relevant log output

No response

Environment

- OS:
- Python:
- dbt: 1.4.3

Which database adapter are you using with dbt?

snowflake

Additional Context

https://getdbt.slack.com/archives/C50NEBJGG/p1677527674055049

dbeatty10 commented 1 year ago

Thanks for noticing this scenario, instigating a discussion in Slack, and opening this issue @seub 🏆 !

I agree with you and @jtcohen6 that it should align with the expected behavior and work across all three scenarios.

dbeatty10 commented 1 year ago

Acceptance criteria

  1. A new test is added that verifies macros within pre- and post-hooks are late-rendered when they are specified within a model properties YAML configuration file. Use YAML similar to that described in the "Current Behavior" above.
  2. This test fails prior to implementing the feature.
  3. This test succeeds after implementing this feature for pre-hook, post-hook, pre_hook, and post_hook.
  4. Ensure that late-rendering tests exist for dbt_project.yml and model config blocks also.

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

d-musambi commented 1 year ago

This would be very beneficial for us, as we maintain a multi-tenant architecture, and it would allow us to have dbt docs generate for our models that run across tenants.

siljamardla commented 1 year ago

The issue is still relevant, still experiencing inconsistent behaviour. We have just recently decided that, for clarity, we will keep all of our configurations in the configuration files so they would not be scattered between configuration files and model files. This issue is making it impossible to keep all the configurations in configuration files :(

occulkot commented 1 year ago

I've just bumped into this issue myself when we started using post-hook in our dbt project.

adrivn commented 1 year ago

Same. I refactored all the individual schema.yml files along with the SQL models, removing the individual hooks, to unify them into a single properties.yml file, but it seems the pre- and post-hooks aren't allowed to run. Is anyone looking into this inconsistency?

dbeatty10 commented 10 months ago

Multiple people have reported the desire to put all configs within properties.yml YAML files (rather than having some configs within individual model files).

Resolving this issue will provide that capability. In the meantime, a workaround is provided below along with a reproducible example (reprex).

Reprex

This works when running dbt compile, dbt run, dbt build, etc:

models/my_model.sql

{{ config(
    pre_hook="{{ some_macro() }}"
) }}

select 1 as id

And so does this:

dbt_project.yml

name: my_project
profile: my_profile

models:
  my_project:
    +pre-hook: "{{ some_macro() }}"

But this doesn't:

models/_properties.yml

models:
  - name: my_model
    config:
      pre_hook: "{{ some_macro() }}"

And it gives this error message:

14:31:46  Encountered an error:
Compilation Error
  Could not render {{ some_macro() }}: 'some_macro' is undefined

Workaround

In the meantime, the workaround is to put hook configs within model files instead of properties.yml / schema.yml:

{{ config(
    pre_hook="{{ some_macro() }}"
) }}

...

rbs392 commented 9 months ago

@dbeatty10 What would you suggest for python models?

dbeatty10 commented 9 months ago

@rbs392 See below for something you could try out for dbt python models. I didn't try it out personally, so let me know if it works or not 🙏

Let's suppose you have a Python model at the path models/subfolder_1/subfolder_2/my_python_model.py. Then you could try this within dbt_project.yml:

name: my_project
profile: my_profile

models:
  my_project:
    subfolder_1:
      subfolder_2:
        my_python_model:
          +pre-hook: "{{ some_macro() }}"

rbs392 commented 9 months ago

you could try out for dbt python models. I didn't try it out personally, so let me know if it works or not 🙏

Yeah this seems to work 😄 Would be nice to get it working with schema.yml as well

dbeatty10 commented 9 months ago

We're not able to prioritize this ourselves right now, so labeling this as "help wanted".

toandm commented 7 months ago

This is also an issue when I try to apply a macro to a post-hook within a seed property file. Since you cannot add config directly to a seed file, the only workaround is to use dbt_project.yml. Would love to have this fixed so it would help trim down the already bloated dbt_project.yml file.

dbeatty10 commented 4 months ago

I ran into this myself this week.

The solution would be late-rendering hooks that appear within properties.yml files (like https://github.com/dbt-labs/dbt-core/pull/6435 added late-rendering for pre_hook and post_hook in dbt_project.yml).

The relevant code appears to be should_render_keypath / _is_norender_key within the SchemaYamlRenderer class.

The analogous code for dbt_project.yml is should_render_keypath within the DbtProjectYamlRenderer class.
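
To make the idea concrete, here is a minimal standalone sketch of a keypath check that leaves hook values unrendered at parse time. It is not dbt-core's actual code: the function body, the `Keypath` alias, and the assumption that a properties-file keypath looks like `("config", "pre_hook")` are all hypothetical and would need to be checked against the real SchemaYamlRenderer.

```python
from typing import Tuple, Union

# Hypothetical alias: a keypath is the sequence of dict keys and list
# indexes leading to a value inside the parsed YAML.
Keypath = Tuple[Union[str, int], ...]

# Both spellings of the hook keys named in the acceptance criteria.
LATE_RENDERED_HOOKS = {"pre-hook", "post-hook", "pre_hook", "post_hook"}


def should_render_keypath(keypath: Keypath) -> bool:
    """Return False when the value must stay as raw Jinja until run time."""
    # Drop the optional "+" prefix used in dbt_project.yml (e.g. "+pre-hook");
    # list indexes (ints) are kept unchanged.
    parts = {k.lstrip("+ ") if isinstance(k, str) else k for k in keypath}
    # Any hook key anywhere on the path means "do not render yet".
    return not (parts & LATE_RENDERED_HOOKS)
```

With this check, a value under `config: pre_hook:` or under a `post-hook:` list entry would be skipped by the early render pass, while other keys such as `config: materialized:` would still be rendered as usual.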

dbeatty10 commented 3 months ago

To expand on the use-case of adding a post hook to a seed in a properties YAML file, here's an example:

seeds:
  - name: seed_name
    config:
      post_hook: "alter table {{ this }} alter column id set not null"
    columns:
      - name: id

But when running dbt seed, I got an error like this because {{ this }} wasn't late-rendered:

13:43:24    Database Error in seed seed_name (seeds/seed_name.csv)
  syntax error at or near "COLUMN"
  LINE 3:         alter table  alter column id set not null

If instead this worked, then this would be a useful alternative to add to https://github.com/dbt-labs/dbt-core/issues/10551 for adding database constraints to a seed (or snapshot).
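
The difference between eager and late rendering in the seed example above can be illustrated with a toy renderer. This is only a sketch: the `render` function is a hypothetical stand-in for a template engine that renders a missing variable as an empty string, and the relation name `analytics.seed_name` is made up for illustration.

```python
import re

# Toy stand-in for a template engine: replaces {{ name }} with context[name]
# and renders a missing name as an empty string.
_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, context: dict) -> str:
    return _VAR.sub(lambda m: str(context.get(m.group(1), "")), template)


hook = "alter table {{ this }} alter column id set not null"

# Eager rendering: the properties file is rendered at parse time, when
# `this` is not in the context, so the table name silently disappears.
eager_sql = render(hook, {})

# Late rendering: keep the raw string at parse time and render it only
# when the seed runs, with `this` bound to the seed's relation.
late_sql = render(hook, {"this": "analytics.seed_name"})

print(eager_sql)
print(late_sql)
```

The eagerly rendered string has the same gap after `alter table` as the statement in the database error, which is why the hook needs to stay unrendered until the seed actually runs.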

lucidviews commented 2 weeks ago

Hi team,

this bug seems to be back for YAML-defined snapshots. I'm running this in dbt Cloud on versionless and I'm getting the following server error for any macro that I call as either post- or pre-hook:

Compilation Error
  Could not render {{test()}}: 'test' is undefined

This is my snapshot definition in a Jaffle shop project:

snapshots:
  - name: stg_customers_snapshot
    relation: ref('stg_customers')
    config:
      database: ANALYTICS
      schema: DBT_LSILBERNAGEL
      unique_key: customer_id
      strategy: check
      check_cols: ['customer_name']
      post-hook:
        - '{{test()}}'

The macro is defined as:

{% macro test() -%}
    {{ return('SELECT 1') }}
{%- endmacro %}

Lmk if you need more info.

lucidviews commented 2 weeks ago

UPDATE: The above behaviour is due to the underscore vs. hyphen spelling of the hook. Related issue: https://github.com/dbt-labs/dbt-core/issues/10965

dbeatty10 commented 2 weeks ago

@lucidviews Thanks for connecting this to https://github.com/dbt-labs/dbt-core/issues/10965. That issue seems to precisely describe what you are observing for post-hook (hyphen) vs post_hook (underscore) for snapshots. The former always works in dbt_project.yml files, and the latter always works within properties/schema YAML files (schema.yml, properties.yml, etc.)