fivetran / dbt_fivetran_utils

Helper utils for our packages

bugfix/union-data-no-union-source #110

Closed fivetran-joemarkiewicz closed 1 year ago

fivetran-joemarkiewicz commented 1 year ago

PR Overview

This PR will address the following Issue/Feature: #107

This PR will result in the following new package version: v0.4.5

This will not be a breaking change. It only adjusts the union_data macro so that a source connection is established when the macro is used and no unioning is needed.

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:

Currently the union_data macro is used in three scenarios:

  1. A package user is leveraging the [connector]_union_schemas variable to union multiple projects across schemas.
  2. A package user is leveraging the [connector]_union_databases variable to union multiple projects across databases.
  3. A package user is not trying to union multiple projects and just wants to use one schema.

For the two union variable use cases, the macro works as expected and connects the source to the downstream models. However, it does not connect the source when only one schema is being used. This is because the "else" logic in the macro does not point to the source the way the other branches of the macro do.

This is not a glaring error (it will not actually cause runs to fail), but it does have an adverse effect on the generated docs. Since the relation is not leveraging the source macro, the package will generate the models with floating sources (see the pic below where I am using QuickBooks as an example). [image]

This is especially problematic because it also affects Fivetran Transformations, which cannot establish a connection to the upstream connector (the source.yml file is what is used to create this connection). See the pics below, where the connector is not identified in the Fivetran Transformations portal. In particular, notice that the Connectors column shows "none", indicating the connector link has not been established. [image] [image]

As seen above, this means integrated scheduling will not work. It also creates a poor user experience, since the transformation appears to be disconnected from its connector (although it isn't, and runs will still succeed).

The changes in this PR update the "else" conditional in the union_data macro so that a source connection is established when a package user is not unioning datasets and is instead using just one. This section of the macro didn't work in the previous build because it relied only on the vars. That was enough to read data from the proper place, but it did not actually create a link to the source. The updates in this PR therefore explicitly use the source function, together with the variables, to establish the source connection.
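To make the difference concrete, here is a rough sketch of the idea (not the actual diff in this PR). The macro names, argument names, and the pass-through `select *` are all hypothetical and only illustrate the contrast between resolving a relation from vars alone and resolving it through the source function.

```sql
{# Hypothetical sketch, not the actual macro code from this PR.
   Macro names and arguments are illustrative only. #}

{# Before: the relation is built from variables alone. dbt reads from the
   right place, but no source node is referenced, so no lineage is recorded. #}
{% macro single_schema_relation_before(database_variable, schema_variable, default_database, default_schema, table_identifier) %}
    {% set relation = adapter.get_relation(
        database=var(database_variable, default_database),
        schema=var(schema_variable, default_schema),
        identifier=table_identifier
    ) %}
    select * from {{ relation }}
{% endmacro %}

{# After: the relation comes from source(), so dbt can link the model to the
   source defined in the package's source.yml file. That source definition is
   assumed to set its own database and schema from the same variables. #}
{% macro single_schema_relation_after(source_name, table_identifier) %}
    select * from {{ source(source_name, table_identifier) }}
{% endmacro %}
```

A staging model could then call something like `{{ single_schema_relation_after('quickbooks', 'account') }}`, where the source and table names are placeholders chosen for illustration.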

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

Before marking this PR as “ready for review” the following have been applied:

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as “ready for review”:

To validate these changes I performed the following:

Standard Updates

Please acknowledge that your PR contains the following standard updates:

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

If you had to summarize this PR in an emoji, which would it be?

🔗
fivetran-joemarkiewicz commented 1 year ago

When deploying changes to packages that have Quickstart models enabled, is there a way we can test out a working branch in a Quickstart or Quickstart-like environment?

This is a great question and possibly something we should consider. We would likely need to work with other teams, but I feel this would help ensure there are no bugs that go unnoticed until they are deployed in Quickstart.