Bug: Making sure components data is properly joined onto JIRA daily issue field history and downstream models

PR Overview

This PR will address the following Issue/Feature: #83

This PR will result in the following new package version: v0.14.0

This is a breaking change as a dbt run --full-refresh is required to reflect the proper components data for customers.

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:

The main issue customers are facing is that components data was being joined incorrectly, because field_option can contain identifiers that match the id of a component.

So when the join below is executed within jira__daily_issue_field_history, it returns the field option values instead of the component names for the component fields.

This PR addresses this issue by joining components separately from the field option model. To generate historical component data in jira__daily_issue_field_history, the customer must be using the components source model and must specify components as a field in issue_field_history_columns in the dbt_project.yml. If either condition is not met, components is removed from jira__daily_issue_field_history to avoid any improper joins. See lines 113-116 and lines 127-130 for the modifications in this join.

This conditional is then implemented throughout the model so that there are no issues for customers who choose not to leverage components as well.
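As a rough illustration of the conditional described above, the sketch below expresses the two conditions in plain Python. It is not the package's Jinja/SQL: the function names `resolve_history_columns` and `lookup_table_for` are hypothetical, and the arguments simply stand in for the `issue_field_history_columns` and `jira_using_components` variables.

```python
def resolve_history_columns(history_columns, using_components):
    """Return the columns to track, keeping components only if both conditions hold."""
    # Condition 1: the components source is enabled (stand-in for jira_using_components).
    # Condition 2: components is listed in issue_field_history_columns.
    include_components = using_components and "components" in history_columns
    if include_components:
        return list(history_columns)
    return [c for c in history_columns if c != "components"]


def lookup_table_for(column):
    """Components get their own lookup; every other field goes through field_option."""
    return "component" if column == "components" else "field_option"


cols = resolve_history_columns(["summary", "components"], using_components=True)
print({c: lookup_table_for(c) for c in cols})
```

With `using_components=False`, or with components left out of the list, the same call returns only the remaining fields, which mirrors the behavior tested later in this PR.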

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

[x] dbt compile
[x] dbt run --full-refresh
[x] dbt run
[x] dbt test
[x] dbt run --vars (if applicable)

Before marking this PR as "ready for review" the following have been applied:

[x] The appropriate issue has been linked and tagged
[x] You are assigned to the corresponding issue and this PR
[x] BuildKite integration tests are passing

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":

[x] You have validated these changes and assure this PR will address the respective Issue/Feature.
[x] You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
[x] You have provided details below around the validation steps performed to gain confidence in these changes.

Bug replication.

Look at field_id = 10019 for the field_option source model:

Then see the id = 10019 for the component source model:

Thus, when the initial join occurs, the components field downstream incorrectly returns the field_option values rather than the component names.

So we need to isolate components with its own specific logic.
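The collision can be reproduced in isolation with a minimal sketch using Python's built-in sqlite3 module. Only the id 10019 comes from the replication above; the table layouts, column names, and row values are hypothetical stand-ins, and the two queries approximate the idea of the old and new joins rather than the package's actual SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table field_option (id integer, name text);
    create table component (id integer, name text);
    create table field_history (issue_id integer, field_name text, field_value text);

    -- Hypothetical rows: the same id exists in both lookup tables.
    insert into field_option values (10019, 'Some field option');
    insert into component values (10019, 'Some component');
    insert into field_history values (1, 'components', '10019');
""")

# Old behavior (sketch): every field value is resolved through field_option,
# so the component id picks up the field option's name.
buggy = conn.execute("""
    select coalesce(fo.name, fh.field_value)
    from field_history fh
    left join field_option fo
        on cast(fo.id as text) = fh.field_value
""").fetchall()

# New behavior (sketch): components are resolved through component,
# and field_option is only consulted for the other fields.
fixed = conn.execute("""
    select coalesce(c.name, fo.name, fh.field_value)
    from field_history fh
    left join component c
        on fh.field_name = 'components' and cast(c.id as text) = fh.field_value
    left join field_option fo
        on fh.field_name != 'components' and cast(fo.id as text) = fh.field_value
""").fetchall()

print(buggy)  # [('Some field option',)]
print(fixed)  # [('Some component',)]
```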

PR solution and implementation
With the new join described above in the PR description, if components exist and are one of the fields listed for the issue field history table, any join on the components field now uses the components table instead of field option and returns the correct component names.

We made sure our seed files had the proper components information to be joined upstream.

As you can see, the new version of the model returns the component value.

Looking downstream at models that flow from the daily issue field history, like jira__issue_enhanced, you can see components is also being properly brought in.

We're on the right track!

What happens if components are not configured or part of field history?

Say we remove components from the issue_field_history_columns in our dbt_project.yml. That means we should only return summary and story points if those fields are present in the issue_field_history.

We added summary rows to our seed files to test this (see the above seed files for the initial PR fix).

The resulting jira__daily_issue_field_history was the following after execution.

You can see that components is now removed after being included in the initial run, but the summary field is populated with the appropriate information.

Similar behavior occurs when the variable jira_using_components is set to false.

What happens on an incremental load?

Because the jira__daily_issue_field_history model can be large, many customers run it incrementally so that only the most recent set of updated rows is loaded. We tested this by modifying the int_jira__issue_calendar_spine to filter on a specific date range so only certain components are returned up to a certain day, then running a dbt run --full-refresh.

This would then return records for this issue up to '2020-11-14'.

We then modified the date range by a day ('2020-11-15'), then executed a dbt run to see if the components loaded properly.

Just to be safe, we then modified the date range filter in the calendar spine by several days (up to '2020-11-21') to see if there was any issue with multiple dates being incrementally loaded, then executed a dbt run.

All looks good!
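The test sequence above can be approximated with a small sqlite3 simulation. This is only a sketch of the general incremental idea (append the days that are not loaded yet), not dbt's materialization or the package's own incremental filter; the `spine` and `daily_history` tables, the `run` helper, the start date, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table spine (date_day text, issue_id integer, components text);
    create table daily_history (date_day text, issue_id integer, components text);
""")

# Hypothetical upstream rows, one per day from 2020-11-12 to 2020-11-21.
days = [f"2020-11-{d:02d}" for d in range(12, 22)]
conn.executemany(
    "insert into spine values (?, 1, 'Some component')", [(d,) for d in days]
)


def run(cutoff, full_refresh=False):
    """Load spine rows up to `cutoff`; incremental runs only add days not yet loaded."""
    if full_refresh:
        conn.execute("delete from daily_history")
    last = conn.execute("select max(date_day) from daily_history").fetchone()[0] or ""
    conn.execute(
        "insert into daily_history "
        "select * from spine where date_day <= ? and date_day > ?",
        (cutoff, last),
    )
    return conn.execute(
        "select count(*), max(date_day) from daily_history"
    ).fetchone()


first = run("2020-11-14", full_refresh=True)  # full refresh up to the first cutoff
second = run("2020-11-15")                    # incremental run, one new day
third = run("2020-11-21")                     # incremental run, several new days
print(first, second, third)
```

Each call returns the row count and latest loaded day, so a one-day and a multi-day incremental run can both be checked for gaps or duplicates.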

Running the same steps when components are not being utilized achieved similar results to above: a successful load of the summary field by day, with components excluded.

Non-incremental sense check

Removing the date_day filters on int_jira__issue_calendar_spine and running dbt run --full-refresh and dbt run on jira__daily_issue_field_history yields the full table of results with the correctly populated component name if components are being leveraged.

Likewise, if components are not being leveraged, it removes components as a field while still correctly populating summary.

Standard Updates

Please acknowledge that your PR contains the following standard updates:

Package versioning has been appropriately indexed in the following locations:
- [x] indexed within dbt_project.yml
- [x] indexed within integration_tests/dbt_project.yml
[x] CHANGELOG has individual entries for each respective change in this PR
[NA] README updates have been applied (if applicable)
[NA] DECISIONLOG updates have been updated (if applicable)
[NA] Appropriate yml documentation has been added (if applicable)

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

[x] docs were regenerated (unless this PR does not include any code or yml updates)

If you had to summarize this PR in an emoji, which would it be?

🐈‍⬛

fivetran / dbt_jira