fivetran / dbt_jira

Data models for Fivetran's Jira connector built using dbt.
https://fivetran.github.io/dbt_jira/
Apache License 2.0

Bugfix/updated issues field tracking #104

Closed fivetran-joemarkiewicz closed 1 year ago

fivetran-joemarkiewicz commented 1 year ago

PR Overview

This PR will address the following Issue/Feature: Issue #100

This PR will result in the following new package version: v0.14.0

Although this is not technically a breaking change, package users will benefit from a full refresh. It will also be batched with the changes in PR #103, which are breaking.

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:

The main change in this PR adjusts the int_jira__issue_calendar_spine logic to reference the int_jira__field_history_scd model as an upstream dependency. In particular, the open_until field within int_jira__issue_calendar_spine now depends on the valid_starting_on column of int_jira__field_history_scd rather than on the issue table's updated_at field.

This is required because some resolved issues (outside of the 30 day or jira_issue_history_buffer variable window) were having faulty incremental loads. Changes to untracked fields (fields not tracked via the issue_field_history_columns variable, or fields not captured in the history tables at all, such as Links, Comments, etc.) caused the issue's updated_at column to update even though no tracked field had changed, and that in turn caused the faulty incremental load.
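To make the difference concrete, here is a minimal Python sketch of the two ways of deriving open_until. It is a simplified illustration, not the package's SQL: the function names and the tracked-change dates are hypothetical, and only the 2023-06-08 comment date comes from the validation in this PR.

```python
from datetime import date

# Hypothetical dates on which tracked fields changed for one resolved issue
# (standing in for valid_starting_on in int_jira__field_history_scd).
tracked_changes = [date(2020, 5, 1), date(2020, 5, 10)]

# A comment (an untracked change) bumps updated_at on the issue table
# long after the issue was resolved.
issue_updated_at = date(2023, 6, 8)


def open_until_from_issue(updated_at):
    """Simplified previous behavior: any update to the issue moves open_until."""
    return updated_at


def open_until_from_field_history(valid_starting_on_dates):
    """Simplified new behavior: only tracked field changes move open_until."""
    return max(valid_starting_on_dates)


old_open_until = open_until_from_issue(issue_updated_at)
new_open_until = open_until_from_field_history(tracked_changes)
print(old_open_until, new_open_until)
```

Under these assumptions, the comment pushes the old open_until years past the last tracked change, while the new open_until stays on the date of the last tracked change.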

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

Before marking this PR as "ready for review" the following have been applied:

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":

To validate this change, I first wanted to recreate the issue with the current version of the package. As identified, the issue occurs when a resolved ticket (outside of the 30 day or variable window) has an untracked change (e.g. a comment) made after the issue was resolved. Let's jump in and recreate it.

Let's first take a ticket that has been resolved and is clearly outside the window. In this case I will use issue_id 10002.

image

Now let's run the package and see what the result of jira__daily_issue_field_history is for this issue (sorting by the latest day the issue was tracked). Looking at the screenshot below, we can see that the latest day tracked is 2020-06-09 and all looks as it should.

image

Let's now comment on this ticket to cause a new untracked change and see the result. (I will not share the Jira screenshots, but know that a comment was made on that ticket on 06/08/2023.) After the comment was applied, let's run the package and see what results show up in the jira__daily_issue_field_history model for this issue.

image

Hmmm, that certainly looks off. Now let's test this on a new ticket and confirm the expected behavior on the branch for this PR. For context, the expected behavior is that if a resolved ticket is updated for an untracked field (e.g. a comment), then we will not see a change to the jira__daily_issue_field_history results. However, if a tracked field is updated (e.g. sprint), then tracking should pick back up starting on the day the field change occurs. This will then continue for 30 days (or for the window set by the jira_issue_history_buffer variable).
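The expected behavior can be sketched as a small Python check. This is only an illustration of the rule stated above, not the package's implementation: reopened_window, TRACKED_FIELDS, and BUFFER_DAYS are hypothetical names, and the 30 day value mirrors the default window mentioned in this PR.

```python
from datetime import date, timedelta

# Assumption: sprint is tracked (as in this PR); comments are not.
TRACKED_FIELDS = {"sprint"}
# Assumption: the default 30 day window described in this PR.
BUFFER_DAYS = 30


def reopened_window(changed_field, changed_on):
    """Return the (start, end) days tracking resumes for, or None if unchanged."""
    if changed_field not in TRACKED_FIELDS:
        # Untracked change (e.g. a comment): the daily history is left alone.
        return None
    # Tracked change: resume on the change day and continue for the buffer.
    return changed_on, changed_on + timedelta(days=BUFFER_DAYS)


comment_window = reopened_window("comment", date(2023, 6, 8))
sprint_window = reopened_window("sprint", date(2023, 6, 8))
print(comment_window, sprint_window)
```

In this sketch a comment yields no new window, while a sprint change on the same day reopens tracking from that day through the end of the buffer.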

So let's look at issue 10012 (see first screenshot for details on this issue). Now let's run the model without making any changes and see the results of the final model.

image

All looks in place. So now let me make a comment on that ticket and see if the same issue from the current version of the package persists for this change. Looking at the results below, I can see that there is no change to the end model (as expected). Wahoo!

image

Okay, let's stress test this though. I would expect if I make a change to a tracked field (such as sprint) for this issue, then it should bring in the latest day and the updated sprint value. So for this issue I am going to update the sprint to dupe sprint 1 and see if that is properly tracked in the end model.

image

Say whaaaat, looks like dupe sprint 1 was properly brought into the end model. Okay, that is all fine and well, but let's now make sure that a second change is properly captured. This is just to stress that the logic is properly capturing the latest value of the field. So I am now going to change the sprint value to dupe sprint 2 and see if it is brought in on the next run.

image

And with that, I would call the validation a wrap!!

Standard Updates

Please acknowledge that your PR contains the following standard updates:

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

I will hold off on regenerating docs until the release branch so as not to cause conflicts or rework.

If you had to summarize this PR in an emoji, which would it be?

💬