fivetran / dbt_jira

Data models for Fivetran's Jira connector built using dbt.
https://fivetran.github.io/dbt_jira/
Apache License 2.0

[Bug] Issue Field History Incremental Load Missing Rows #125

Closed eli-reber closed 2 months ago

eli-reber commented 3 months ago

Is there an existing issue for this?

Describe the issue

The incremental load for the issue field history tables seems to be missing history for less than 1% of Jira issues. The missing records begin on the date the Jira issue was created. Updates to fields that we are tracking do cause the Jira issue to show up in the jira__daily_issue_field_history table, but there will only be records for the days from the update onward.

Note - A full refresh always resolves this problem.

Example for a Jira issue created on March 25th, and updated on March 27th. Our pipeline ran every day during this period, and there were no failures.

jira__daily_issue_field_history:

[Screenshot 2024-03-31 at 7:04:32 PM]

All of the intermediate tables only contain data from March 27th onward. I won't post all of them for brevity, but here is int_jira_combine_field_histories:

[Screenshot 2024-03-31 at 7:20:28 PM]

All of the records are contained in our data lake source that is being queried, and the correct result (all history) appears when I run the compiled query for int_jira_combine_field_histories.

Relevant error log or model output

No response

Expected behavior

It is expected that there would be a row present for each day the issue has been open.
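One way to check this expectation against the model output is to build the expected day-by-day spine for an issue and diff it against the days actually present. The sketch below is illustrative only: `expected_days` and `missing_days` are hypothetical helpers, not part of the dbt_jira package, and the year 2024 is assumed from the screenshot timestamps.

```python
from datetime import date, timedelta


def expected_days(created: date, last_day: date) -> list[date]:
    """Every calendar day from creation through last_day, inclusive."""
    n = (last_day - created).days
    return [created + timedelta(days=i) for i in range(n + 1)]


def missing_days(created: date, last_day: date, observed: list[date]) -> list[date]:
    """Days that should have a history row but do not appear in observed."""
    seen = set(observed)
    return [d for d in expected_days(created, last_day) if d not in seen]


# Scenario from the report: issue created March 25th, but rows only
# exist from March 27th onward (year assumed to be 2024).
observed = expected_days(date(2024, 3, 27), date(2024, 3, 31))
gaps = missing_days(date(2024, 3, 25), date(2024, 3, 31), observed)
print(gaps)
```

In practice, `observed` would come from querying the distinct days present for one issue in jira__daily_issue_field_history.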

dbt Project configurations

jira:
  +schema: jira
  intermediate:
    +schema: jira
    int_jira__issue_epic:
      +enabled: false
  jira__daily_issue_field_history:
    +enabled: false

jira_source:
  +schema: jira
  stg_jira__project:
    +enabled: false
  stg_jira__issue:
    +enabled: false

jira_issue_history_buffer: 1200
issue_field_history_columns:

Package versions

What database are you using dbt with?

snowflake

dbt Version

1.7

Additional Context

No response

Are you willing to open a PR to help address this issue?

fivetran-avinash commented 3 months ago

Hello @eli-reber! Thank you for raising this issue with us. We were able to reproduce a version of this locally, so it is definitely something we will need to account for in our models.

After some investigation, we believe that a feature that will introduce lookback windows for incremental models could potentially solve this issue. We are working on that issue this sprint, and once we have a branch in a ready state we would love for you to test it out!
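To illustrate why a lookback window can help, here is a minimal Python simulation of an incremental filter with and without one. It is a simplified stand-in, not the package's SQL. The thread does not state the root cause, so the late-arriving rows for issue "B" are an assumption, and `incremental_load` and `lookback_days` are hypothetical names.

```python
from datetime import date, timedelta


def incremental_load(target, source, lookback_days=0):
    """Merge source rows dated on or after (max loaded day - lookback_days).

    target and source are sets of (issue_id, day) tuples. The set union
    stands in for a merge on a unique key, so reloaded rows do not duplicate.
    """
    if not target:
        return set(source)  # an empty target behaves like a full refresh
    cutoff = max(day for _, day in target) - timedelta(days=lookback_days)
    new_rows = {row for row in source if row[1] >= cutoff}
    return target | new_rows


def d(day):
    return date(2024, 3, day)  # year assumed for illustration


# Already loaded: issue A through March 27th.
target = {("A", d(n)) for n in range(25, 28)}
# Source now also holds issue B's rows back to March 25th (assumed to
# have arrived after the earlier runs), plus March 28th for both issues.
source = {(i, d(n)) for i in ("A", "B") for n in range(25, 29)}

strict = incremental_load(target, source)
with_lookback = incremental_load(target, source, lookback_days=3)

missed = sorted(source - strict)
print(missed)                   # B's rows before the cutoff are never loaded
print(with_lookback == source)  # the lookback run recovers them
```

The strict cutoff reproduces the reported symptom: issue B only has rows from March 27th onward. The trade-off is that a wider window reprocesses more rows on every run.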

We will keep you in the loop for any future developments. Let us know if you have any questions or thoughts!

eli-reber commented 3 months ago

Thank you for the update! I'd be happy to help test your branch when you're ready, just let me know.

fivetran-catfritz commented 3 months ago

Hi @eli-reber, I have a working test branch that incorporates the lookback window @fivetran-avinash mentioned, which you can install using the branch below. I'll also note that the main purpose of my updates is to help performance in addition to data quality, so I'd be interested in any feedback you have from that perspective as well!

I made some materialization changes, so the first run on this branch will need to be a full refresh.

  - git: https://github.com/fivetran/dbt_jira.git
    revision: feature/performance-enhancement
    warn-unpinned: false

fivetran-catfritz commented 2 months ago

Hi @eli-reber, this update has been released in the latest version, which can be installed with the snippet below.

packages:
  - package: fivetran/jira
    version: [">=0.17.0", "<0.18.0"]

I am closing out this issue, but feel free to ping us in this thread with any additional comments or questions!