Closed fivetran-avinash closed 1 year ago
@fivetran-joemarkiewicz The above issue has been addressed to bring in component ids if there are multiple values, but hasn't been addressed to change them back into component names. Additionally, I'm seeing this issue recur for other field values, like sprints. See the below screenshot.
So, we will need to create a new feature request to handle this particular inconsistency.
For the purposes of this bug though, it does bring in the proper component values for each issue and day (even if in varying name or id form), so this should close out this particular task.
PR Overview
This PR will address the following Issue/Feature: #83
This PR will result in the following new package version:
v0.14.0
This is a breaking change, as a `dbt run --full-refresh` is required to reflect the proper components data for customers.
Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:
The main issue customers are facing is that components data was being joined in incorrectly, because there would be identifiers in `field_option` that have the same id as a `component`. So when the join below is executed within `jira__daily_issue_field_history`, it returns the field option values instead of the component names for the component fields.
This PR addresses this issue by ensuring we join on components separately from the field option model. The customer must be utilizing the components source model and specifying `components` as a field in the `issue_field_history_columns` in the `dbt_project.yml` for generating historical data for `jira__daily_issue_field_history`. If either of those conditions does not hold, components is removed from `jira__daily_issue_field_history` to avoid any improper joins. See lines 113-116 and lines 127-130 for the modifications in this join.
This conditional is then implemented throughout the model so that there are no issues for customers who choose not to leverage components.
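As a rough illustration of the conditional described above (not the package's actual Jinja), the rule can be sketched in Python. The function name `history_columns` is hypothetical, and the default of `True` for `jira_using_components` is an assumption.

```python
def history_columns(issue_field_history_columns, jira_using_components=True):
    """Return the field-history columns the model would carry.

    Hypothetical helper: `components` survives only when the components
    source is enabled AND it is listed in issue_field_history_columns.
    """
    include_components = (
        jira_using_components and "components" in issue_field_history_columns
    )
    return [
        column
        for column in issue_field_history_columns
        if column != "components" or include_components
    ]


# Both conditions hold: components is kept.
print(history_columns(["summary", "components"]))
# Components source disabled: components is dropped, summary is untouched.
print(history_columns(["summary", "components"], jira_using_components=False))
# Components never requested: nothing to drop.
print(history_columns(["summary", "story points"]))
```

If either condition fails, the column is dropped rather than joined, which is why no improper join can occur in that case.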
PR Checklist
Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
Before marking this PR as "ready for review" the following have been applied:
Detailed Validation
Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":
Bug replication.
Look at `field_id` = 10019 for the `field_option` source model:
Then see the `id` = 10019 for the `component` source model:
Thus, when the initial join occurs, the component values returned downstream are incorrectly the field_option values rather than the component names for the components field.
So we need to isolate components with its own specific logic.
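The collision can be reproduced in miniature with an in-memory SQLite database. This is only a sketch: the table layouts are heavily simplified, the names `Some field option` and `Backend` are made up, and the SQL is not the package's actual join.

```python
import sqlite3

# Simplified, hypothetical stand-ins for the source models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table field_option (field_id integer, name text);
    create table component (id integer, name text);
    create table issue_field_history (issue_id integer, field_id text, value text);

    insert into field_option values (10019, 'Some field option');
    insert into component values (10019, 'Backend');
    insert into issue_field_history values (1, 'components', '10019');
""")

# Buggy shape: the components value is resolved through field_option,
# so the colliding id 10019 picks up the field option's name.
buggy = conn.execute("""
    select h.issue_id, coalesce(fo.name, h.value) as components
    from issue_field_history h
    left join field_option fo on cast(fo.field_id as text) = h.value
    where h.field_id = 'components'
""").fetchall()

# Fixed shape: the components field is resolved through component only.
fixed = conn.execute("""
    select h.issue_id, coalesce(c.name, h.value) as components
    from issue_field_history h
    left join component c on cast(c.id as text) = h.value
    where h.field_id = 'components'
""").fetchall()

print(buggy)  # [(1, 'Some field option')]
print(fixed)  # [(1, 'Backend')]
```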
PR solution and implementation
With the new join described in the PR description above, if components are enabled and are one of the fields listed for the issue field history table, the components field joins on the components table instead of field option and grabs the correct component names.
We made sure our seed files had the proper components information to be joined upstream.
As you can see, the new version of the model returns the component value.
Looking downstream at models that flow from the daily issue field history, like `jira__issue_enhanced`, you can see components is also being properly brought in.
We're on the right track!
What happens if components are not configured or part of field history?
Say we remove components from the `issue_field_history_columns` in our `dbt_project.yml`. That means we should only return `summary` and `story points` if those fields are present in the `issue_field_history`.
We added `summary` rows to our seed files to test this (see the above seed files for the initial PR fix).
The resulting `jira__daily_issue_field_history` was the following after execution.
You can see that components is now removed after being included in the initial run, but the summary field is populated with the appropriate information.
Similar behavior occurs when the variable `jira_using_components` is set to false.
What happens on an incremental load?
Because of the potential size of the `jira__daily_issue_field_history` model, many customers opt to do incremental loads to only load the most recent set of updated rows. We tested this by modifying the `int_jira__issue_calendar_spine` to filter on a specific date range so only certain components are returned up to a certain day, then running a `dbt run --full-refresh`.
This would then return records for this issue up to '2020-11-14'.
We then modified the date range by a day ('2020-11-15'), then executed a `dbt run` to see if the components loaded properly.
![Screenshot 2023-05-10 at 12 36 58 PM](https://github.com/fivetran/dbt_jira/assets/108772760/297a57d3-4d3a-474f-be52-7f28c7c32f38)
Just to be safe, we then modified the date range filter in the calendar spine by several days (up to '2020-11-21') to see if there was any issue with multiple dates being incrementally loaded, then executed a `dbt run`.
All looks good!
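The three runs above can be mimicked with a small Python simulation. It assumes a generic "append only days newer than the current maximum" incremental pattern, which may differ from the model's real incremental logic. The start date of 2020-11-12 and the component name `Backend` are hypothetical.

```python
from datetime import date, timedelta


def calendar_spine(start, end):
    """Yield every day from start through end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def run(table, spine_end, full_refresh=False, start=date(2020, 11, 12)):
    """Mimic a run: rebuild on full refresh, otherwise append only newer days."""
    if full_refresh or not table:
        table.clear()
        cutoff = None
    else:
        cutoff = max(row["date_day"] for row in table)
    for day in calendar_spine(start, spine_end):
        if cutoff is None or day > cutoff:
            table.append({"date_day": day, "components": "Backend"})
    return table


table = []
run(table, date(2020, 11, 14), full_refresh=True)  # initial full refresh
run(table, date(2020, 11, 15))                     # one extra day
run(table, date(2020, 11, 21))                     # several extra days
print(len(table), max(row["date_day"] for row in table))
```

Each incremental run adds only the days beyond what is already loaded, so no day is duplicated and every new row carries the component name.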
Running the same steps when components are not being utilized achieved similar results to above: a successful load of the `summary` field by day, with components excluded.
Non-incremental sense check
Removing the date_day filters on `int_jira__issue_calendar_spine` and running `dbt run --full-refresh` and `dbt run` on `jira__daily_issue_field_history` yields the full table of results with the correctly populated component name if components are being leveraged.
Likewise, if components are not being leveraged, it removes `components` as a field while still correctly populating `summary`.
Standard Updates
Please acknowledge that your PR contains the following standard updates:
dbt Docs
Please acknowledge that after the above were all completed the below were applied to your branch:
If you had to summarize this PR in an emoji, which would it be?
🐈⬛