fivetran / dbt_klaviyo_source

https://fivetran.github.io/dbt_klaviyo_source/
Apache License 2.0
2 stars 5 forks source link

New Feature: Unioning source data #8

Closed pawelngei closed 3 years ago

pawelngei commented 3 years ago

Are you a current Fivetran customer? Yes, Pawel Ngei, Pattern Brands

What change(s) does this PR introduce?

Does this PR introduce a breaking change?

If I understand the setup correctly, there should be absolutely no changes for a single-database, single-schema setup.

Is this PR in response to a previously created Issue

How did you test the PR changes?

Tested on our private Pattern Brands project.

Select which warehouse(s) were used to test the PR

Provide an emoji that best describes your current mood :bulb:

Feedback

It'd be great to get some better testing instructions without the CircleCI ;)

pawelngei commented 3 years ago

Ah, it turns out I misconfigured my project and this PR doesn't really work. For some reason stg_klaviyo__event_tmp.sql compiles from:

{{
    fivetran_utils.union_data(
        table_identifier='event_table', 
        database_variable='klaviyo_database', 
        schema_variable='klaviyo_schema', 
        default_database=target.database,
        default_schema='klaviyo',
        default_variable='event_table'
    )
}}

into:

  create or replace  view ANALYTICS.dbt_pawel_stg_klaviyo.stg_klaviyo__event_tmp  as (

  );

Any idea what might cause it?

EDIT:

pawelngei commented 3 years ago

The previous bug was caused by the wrong name of the table_identifier for event, it should be event instead of event_table. Now the problem is the _dbt_source_relation which despite being in the latest dbt-utils seems to be a problem for SQL:

Database Error in model stg_klaviyo__event (models/stg_klaviyo__event.sql)
  000904 (42000): SQL compilation error: error line 173 at position 31
  invalid identifier '_DBT_SOURCE_RELATION'
  compiled SQL at target/run/klaviyo_source/models/stg_klaviyo__event.sql

The SQL in question:

    when lower(replace(replace(_dbt_source_relation,'"',''),'`','')) like '%.klaviyo_gir.%' then 'klaviyo_gir'

    when lower(replace(replace(_dbt_source_relation,'"',''),'`','')) like '%.klaviyo_pb.%' then 'klaviyo_pb'

Our Klaviyo datasets have quite a big difference in the number of custom columns. Could this be the problem, not encountered in Shopify before?

I'll need to wrap up for today, will be back to debugging this PR tomorrow!

fivetran-joemarkiewicz commented 3 years ago

Thanks so much @pawelngei let me know if you encounter any other questions! 🎉 💯

pawelngei commented 3 years ago

Addressed all the problems. The above bug stemmed from the stg_klaviyo__event using more intermediary helper tables than the shopify version of this plugin. The _dbt_source_relation was available in the base table, but not in the fields we were pulling from. I created add_dbt_source_relation macro which adds the field to the fields database if we're querying from multiple sources.

Tested on both a single source and multiple datasets.

Question to you @fivetran-joemarkiewicz : should we keep the add_dbt_source_relation macro, or rather modify fivetran_utils.fill_staging_columns (this might affect other plugins which don't use intermediary tables) or get_columns_in_relation?

My solution repeats the function call per-table, but might be the cleanest if we don't want to refactor the fields table.

Should be good to test with your internal CI / merge.

fivetran-jamie commented 3 years ago

hey FYI testing this out now and will approve/merge tomorrow - thanks @pawelngei!

fivetran-jamie commented 3 years ago

decided it would be best to house add_dbt_source_relation in our fivetran-utils package, so merging this PR into a working branch so i can handle that 👍