fivetran / dbt_fivetran_log

Data models for Fivetran's internal log connector built using dbt.
https://fivetran.github.io/dbt_fivetran_log/
Apache License 2.0

bugfix/schema-name #87

Closed fivetran-joemarkiewicz closed 1 year ago

fivetran-joemarkiewicz commented 1 year ago

PR Overview

This PR will address the following Issue/Feature: No linked issue in GitHub. This relates to the Height ticket T-521886

This PR will result in the following new package version: v1.1.0

Because this PR adds a new field, schema_name, to the incremental fivetran_platform__audit_table model, it is a breaking change: users must run a --full-refresh when upgrading.

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:

🚨 Feature Updates (Breaking Change) 🚨 The below change was made to an incremental model. As such, a dbt run --full-refresh will be required following an upgrade to capture the new column.

Documentation Updates

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

Before marking this PR as "ready for review" the following have been applied:

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":

To validate these changes, I wanted to confirm that the row count was unchanged, which would show that adding the new schema_name field did not alter the underlying source data. However, while validating this I noticed that the row count produced by my branch was larger than that of the current live version of the platform package 🤔

Investigating further, I found that the additional records appeared in only a few places. I was able to isolate the differences to three table_name values: fivetran_audit, user, and ticket.
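The per-table comparison described above can be sketched roughly as follows. This is a minimal Python illustration, not the query actually used in this PR. The sample rows, the schema names `schema_a` and `schema_b`, the `account` table, and the helper `row_count_diff` are all hypothetical.

```python
from collections import Counter


def row_count_diff(old_rows, new_rows, key="table_name"):
    """Return {key value: extra row count} for keys whose row counts differ.

    Hypothetical helper for illustration only; it is not part of the package.
    """
    old_counts = Counter(row[key] for row in old_rows)
    new_counts = Counter(row[key] for row in new_rows)
    return {
        name: new_counts[name] - old_counts[name]
        for name in sorted(set(old_counts) | set(new_counts))
        if new_counts[name] != old_counts[name]
    }


# Hypothetical output of the live version: one row per table_name.
old_rows = [
    {"table_name": "user"},
    {"table_name": "ticket"},
    {"table_name": "account"},
]

# Hypothetical output of the branch: one row per schema_name and table_name.
new_rows = [
    {"schema_name": "schema_a", "table_name": "user"},
    {"schema_name": "schema_b", "table_name": "user"},
    {"schema_name": "schema_a", "table_name": "ticket"},
    {"schema_name": "schema_b", "table_name": "ticket"},
    {"schema_name": "schema_a", "table_name": "account"},
]

# Only the table names that occur in more than one schema show extra rows.
diff = row_count_diff(old_rows, new_rows)
print(diff)
```

Grouping the counts by table_name narrows the investigation to the handful of names that gained rows, rather than requiring a row-by-row comparison of the two outputs.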

Taking a deeper look, I could see that these three tables exist in multiple schemas. For example, the user table exists in both the snowflake_stage_jira and the snowflake_stage_zendesk schemas. The same goes for the ticket table, which exists in both the Zendesk and HubSpot schemas.

In the current version, this distinction is not captured: the sums of rows inserted or replaced for same-named tables in different schemas are incorrectly aggregated together.
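The effect can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module and is only an illustration of the grouping problem, not the package's actual model SQL. The table `sync_events`, the column `rows_written`, and the row counts are hypothetical. The schema and table names are taken from the example in this PR.

```python
import sqlite3

# In-memory database with a hypothetical, simplified event table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sync_events ("
    "schema_name text, table_name text, rows_written integer)"
)
conn.executemany(
    "insert into sync_events values (?, ?, ?)",
    [
        ("snowflake_stage_jira", "user", 10),
        ("snowflake_stage_zendesk", "user", 25),
        ("snowflake_stage_zendesk", "ticket", 40),
    ],
)

# Grouping by table_name alone merges the two distinct user tables.
by_table = conn.execute(
    "select table_name, sum(rows_written) "
    "from sync_events group by table_name order by table_name"
).fetchall()

# Adding schema_name to the grain keeps each physical table separate.
by_schema_table = conn.execute(
    "select schema_name, table_name, sum(rows_written) "
    "from sync_events group by schema_name, table_name "
    "order by schema_name, table_name"
).fetchall()

print(by_table)
print(by_schema_table)
```

With table_name as the only grouping key, the two user tables collapse into one row whose sum matches neither table. Including schema_name yields one row per table, which is the behavior this branch aims for.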

To validate further, I confirmed that the figures shown in the Fivetran UI for these tables match the results produced by this branch. To check this yourself, open the Usage tab for the snowflake_stage connector and compare the totals shown for these specific tables with the totals in the audit_table.

Standard Updates

Please acknowledge that your PR contains the following standard updates:

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

If you had to summarize this PR in an emoji, which would it be?

🌴