LewisDavies / upstream-prod

A dbt package for easily using production data in a development environment.
32 stars 4 forks source link

Env schemas don't seem to work with snapshots that use the `target_schema` config. #28

Closed jack-cook-repo closed 4 months ago

jack-cook-repo commented 8 months ago

I use env schemas, and implemented this package by updating the generate_schema_name macro. But, it seemed to not work in models where I was calling ref against a snapshot.

This is how my variables were set up in dbt_project.yml:

  upstream_prod_env_schemas: true
  upstream_prod_enabled: true
  upstream_prod_disabled_targets:
    - production
  upstream_prod_prefer_recent: true

This is how my macro was set up:

{% macro generate_schema_name(custom_schema_name, node, is_upstream_prod=False) -%}

    {%- set default_schema = target.schema -%}

    {# Dev, and it's a selected node - don't split out schemas #}
    {%- if target.name != "production" and is_upstream_prod == False -%}
        {{ default_schema }}

    {# Production, or dev when it's not a selected node - split out schemas #}
    {%- else -%}
        {# Tell upstream_prod to use our production schema #}
        {%- if is_upstream_prod -%}
            {%- set default_schema = 'dbt_production' -%}
        {%- endif -%}

        {# Break out separate schemas #}
        {%- if custom_schema_name is none -%}
            {{ default_schema }}
        {%- else -%}
            {{ default_schema }}_{{ custom_schema_name | trim }}
        {%- endif -%}

    {%- endif -%}

{%- endmacro %}

All of our snapshots have this in the config block at the top:

{{
    config(
        target_schema='snapshots',
        ...,
        ...,
)
}}

When I tried to dbt run a model that had a ref() pointing at a snapshot, raise_ref_not_found_error raised this:

image

From inspecting it, it looked like it was resolving {{ ref('snapshot__...') }} to my_project.dbt_production.snapshot__... rather than resolving it to the config value of target_schema. I needed the ref to resolve to my_project.snapshots.snapshot__....

As a hacky workaround, I added this clause to the top of my generate_schema_name macro:

    {# Override to get snapshots working #}
    {% if node.config.target_schema == 'snapshots' %}
        snapshots

    {# Dev, and it's a selected node - don't split out schemas #}
    {%- elif target.name != "production" and is_upstream_prod == False -%}

This got everything working again. I'm not exactly sure why my new configuration including is_upstream_prod broke the step in dbt core where it forces the schema to be written to target_schema.

LewisDavies commented 4 months ago

Thanks for opening this issue @jack-cook-repo! Sorry for the late reply but I've only just found the time to fix this properly. Really appreciate all the extra detail you included, it helped me to hone in on the problem much quicker 🙌

I'll try open a PR and release a new version of the package tomorrow. You're welcome to upgrade and try it or continue as-is with your workaround.