fivetran / dbt_jira

Data models for Fivetran's Jira connector built using dbt.
https://fivetran.github.io/dbt_jira/
Apache License 2.0
8 stars 15 forks source link

Issue fields do not update if field updated one-month after resolved date #29

Closed curious-bryan closed 3 years ago

curious-bryan commented 3 years ago

What happened

At some point a jira user updates a custom field on an issue in the jira app after the issue has already been resolved for more than one month. For example, the user realized after the issues was resolved that a custom field was mislabeled. That updated custom field will not be updated in int_jira__field_history_scd model or its dependencies even after a full refresh. I believe the same is true for non-custom fields that get updated one month after resolution date is set.

This issue causes models in the date warehouse to no longer reflect the true state of issue field history available inside Jira app itself. And downstream BI consumers are perplexed why Jira says one thing and data warehouse says another.

Steps to Reproduce

  1. Find an issue resolved jira from more than a month ago
  2. Update custom (or any) field after issue has been resolved
  3. run dbt (even with --full refresh option)
  4. Field will not be updated

What I expected to happen

I expected the issue field to be updated even if the issue has been resolved.

Notes from Investigation

I chased the issue down to the source model that was not updating. It actually is not the model int_jira__field_history_scd but this model's dependency: int_jira__issue_calendar_spine. https://github.com/fivetran/dbt_jira/blob/master/models/intermediate/field_history/int_jira__issue_calendar_spine.sql

The model int_jira__issue_calendar_spine relates every issue to every day it was open plus a buffer of one month, in case the issue is re-opened, as this model is materialized incrementally. So for each issue, this model creates a window of time in which the issue was open. That window of time spans from issue created date to one month after resolution date.

Any field updates made during this time, will be reflected in downstream models. However, any field updates made after this window will not be updated in downstream models.

Possible Solutions

One possible solution (suggested) could be changing Line 45 on int_jira__issue_calendar_spine.sql to use update_at field from stg_jira__issues model instead of resolved_at field. This change would widen the window to one month beyond when the issue was last updated in jira.

Another possible solution could be changing Line 60 on int_jira__issue_calendar_spine.sql to a longer period of time after the resolution date or to a user-defined period of time.

fivetran-jamie commented 3 years ago

hey there sorry for the delay -- working on this rn!

fivetran-jamie commented 3 years ago

hey @curious-bryan would you mind trying out my solution on the jr_updates branch? I took your suggestion of switching resolved_at with updated_at but had to add some extra modifications since the spine model is incrementally materialized.

could you re-run after installing the package with the following configuration?

packages:
  - git: https://github.com/fivetran/dbt_jira.git
    revision: jr_updates
    warn-unpinned: false
curious-bryan commented 3 years ago

Hey @fivetran-jamie! I ran into 2 issues when testing the the jr_updates branch. I think both issues may be my fault ¯_(ツ)_/¯. Would you mind taking a look and setting me straight?

I am running dbt 0.19.1 with your github.com/fivetran/dbt_jira.git@jr_updates version and fivetran/jira_source@0.2.2 package.

Issue 1 - custom schemas

I could not successfully build all my models because of the custom schemas that were created from the model configurations in the dbt_project.yml file in dbt_jira and dbt_jira_source. When I removed that one line from each dbt_project.yml file, the models built as expected. I am not familiar yet with the custom schemas feature in dbt. Is there something simple I can do to get dbt to recognize that jira_stg__version model is located in dbt_cbryan_stg_jira schema instead of the target schema dbt_cbryan?

Issue 2 - ambiguous column name issue

For some reason, I ran into the ambiguous column name issue for jira__issue_enhanced model which is described in Issue https://github.com/fivetran/dbt_jira/issues/23. Do you know what I could be missing?

curious-bryan commented 3 years ago

Snap, @fivetran-jamie. Disregard Issue 1 in my comment above. I haven't read the docs in a while. 🤦‍♂️ I just read that you can disable custom schemas by following directions here: https://github.com/fivetran/dbt_jira#changing-the-build-schema.

fivetran-jamie commented 3 years ago

Cool! Re issue 2 — could you share which fields you’re passing to the daily issue field history model?

curious-bryan commented 3 years ago
  jira:
      issue_field_history_columns: [
        'Components'
        , 'Fix versions'
        , 'Issue Type'
        , 'Story Points'
        , 'Story point estimate'
        , 'Original Developer'
        , 'Original QA Tester'
        , 'Parent'
        , 'Developer'
      ]
fivetran-jamie commented 3 years ago

@curious-bryan ah i couldn't recreate it on my own but i'm pretty sure it's because i'm not running on snowflake! would you mind re-running on the following branch?

# packages.yml
packages:
  - git: https://github.com/fivetran/dbt_jira.git
    revision: bug/snowflake-cols
    warn-unpinned: false
fivetran-jamie commented 3 years ago

@curious-bryan friendly bump! our sprint ends tomorrow so hoping to push this out 🙂

curious-bryan commented 3 years ago

Sorry, @fivetran-jamie , for the delay. I re-ran the branch, and everything looks great from my end. Thank you!