dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
10.01k stars 1.63k forks source link

[CT-3183] [Feature] custom incremental strategy from dbt package #8769

Open sudo-pradip opened 1 year ago

sudo-pradip commented 1 year ago

Is this your first time submitting a feature request?

Describe the feature

Describe alternatives you've considered

Who will this benefit?

Are you interested in contributing this feature?

Yes

Anything else?

{% macro some_macro() %}

    {% set jm = dbt_materializations.get_incremental_merge_null_safe_sql %}
    {% set temp = context.update({'get_incremental_merge_null_safe_sql': jm}) %}

{% endmacro %}
dbeatty10 commented 1 year ago

Thanks for opening this @sudo-pradip !

This is essentially one of the "outstanding questions" listed here:

  • How can users or an organization share their custom incremental strategies or overrides? Can they share in a dbt package on the dbt Package Hub for example?

Let's suppose you add a custom macro named get_incremental_merge_null_safe_sql to a dbt package named dbt_materializations. Suppose further than you install that package within a project named my_project.

Did you already try overriding global macros within dbt_project.yml like this?

dispatch:
  - macro_namespace: dbt
    search_order: ['my_project', 'dbt_materializations', 'dbt']

I haven't tried this myself, so curious if you already tried it.

sudo-pradip commented 1 year ago

Hello @dbeatty10 , Thank you for your prompt response

dbeatty10 commented 1 year ago

Finding the right macro

  • for calling a specific incremental strategy macro, get_incremental_strategy_macro uses model_context, which may not respect the order define under dispatch, feature for specifying search order is only made for adapter.dispatch method ??

You're probably onto something here!

If that area of the code doesn't respect dispatch, then the alternative you mentioned might be our only option currently:

{% macro get_incremental_merge_null_safe_sql(arg_dict) %}
    {{ dbt_materializations.get_incremental_merge_null_safe_sql(arg_dict) }}
{% endmacro %}

Do you have any ideas / opinions about re-implementation of get_incremental_strategy_macro in a different way?

Usage of dispatch

In general, you'd need a separate dispatch configs per package. So for dbt_utils and dbt_expectations you'd need two different dispatch specifications:

dispatch:
  - macro_namespace: dbt_utils
    search_order: ['my_project', 'dbt_utils', 'dbt']
  - macro_namespace: dbt_expectations
    search_order: ['my_project', 'dbt_expectations', 'dbt']

Wanna give that dispatch config a try with dbt_utils and see if it works for you?

sudo-pradip commented 1 year ago

Current

Expectation

My assumptions

Possible solutions

  1. Since model_context has all macros including from dbt packages (but not at top level), so only thing remains is to extract it, we can write same code as adapter.dispatch where first find out search_package (respect the dispatch config), a for loop, model_contex[package_name] is not None else attempts.append, end
  2. I'm not sure about reuse adapter.dispatch at get_incremental_strategy_macro, but even if we can we have to handle prefix(default, snowflake) which we may don't want to change, since many adapter (snowflake) already override get_incremental_strategy_macro macro
  3. we can use execute_macro

Solution 1 we can think, but we have to duplicate our logic at two places (i think it's all right ?), whats your input on this ?

dbeatty10 commented 1 year ago

Thanks for doing this research @sudo-pradip 🧠

I'll either need to grab an extra set of expert eyes from an engineer on the dbt-core team do some deeper research myself. Labeling this as "Refinement" accordingly.

All of us are time-constrained this week and next due to the upcoming release of dbt-core of 1.7.0 and the Coalesce conference next week, so it could be a few weeks (or more) until we can dive deeper into this.

In the meantime, your best bet is the alternative you've already considered:

  • Alternatively we can create a intermediate macro at project to call custom incremental strategy macro
    {% macro get_incremental_merge_null_safe_sql(arg_dict) %}
    {{ dbt_materializations.get_incremental_merge_null_safe_sql(arg_dict) }}
    {% endmacro %}
sudo-pradip commented 11 months ago

any updates ?

dbeatty10 commented 10 months ago

@sudo-pradip we're not able to prioritize further research at this time.

So we've opened https://github.com/dbt-labs/docs.getdbt.com/pull/4716 to document the installation steps that work currently.