fivetran / dbt_jira

Data models for Fivetran's Jira connector built using dbt.
https://fivetran.github.io/dbt_jira/
Apache License 2.0
8 stars 15 forks source link

[Bug] `Valid ending at` field is not updating all the time #64

Closed jamienewcomb closed 1 year ago

jamienewcomb commented 2 years ago

Is there an existing issue for this?

Describe the issue

Hey

I have noticed that the valid ending at field does not always update. I am trying to use this field to work out the time in status for each status as it transitions through the workflow but there are lots of nulls in the dataset. All the tables where this field is used appear to be impacted. Interesting I have no nulls for the valid starting at field

Example issue where it has not populated properly.

Screenshot 2022-09-17 at 16 52 13

Relevant error log or model output

No response

Expected behavior

I would expect as the issue transitions that the timestamps are updated but as you can see this is not happening consistently

dbt Project configurations

will need to request from data team

Package versions

will need to request from data team

What database are you using dbt with?

snowflake

dbt Version

will need to request from data team

Additional Context

No response

Are you willing to open a PR to help address this issue?

jamienewcomb commented 2 years ago

A bit more investigation

The correct values are in the issue field history table so this is a case of them not making their way over to Valid ending at in combine field histories

Screenshot 2022-09-17 at 18 23 59
fivetran-joemarkiewicz commented 2 years ago

Hi @jamienewcomb thank you for opening this issue and sharing your discovery.

Which table are you querying from in the first screenshot? It looks like you are querying from the materialized output of the int_jira__combine_field_histories model. Is that the case?

If so, I would recommend querying from the true end model jira__daily_issue_field_history as this will have the properly populated valid_starting_at and valid_ending_at. The int_jira__combine_field_histories is an intermediate model where all needed logic has not yet taken place. This could be why you are seeing a number of nulls. Those should be addressed and properly accounted for in the final model.

Let me know if this helps!

jamienewcomb commented 2 years ago

hey @fivetran-joemarkiewicz - thanks for the quick response

Correct I was querying this one https://fivetran.github.io/dbt_jira/#!/model/model.jira.int_jira__combine_field_histories

jira__daily_issue_field_history does not appear to have the valid_starting_at and valid_ending_at fields as you have suggested when I query the table. The docs also suggest these fields should not be there https://fivetran.github.io/dbt_jira/#!/model/model.jira.jira__daily_issue_field_history

Let me know if I am missing something there or if those columns need to be added to the end model table

fivetran-joemarkiewicz commented 2 years ago

Hi @jamienewcomb you are not missing anything, I misunderstood the original question you posed. Our jira__daily_issue_field_history only shows the status (in this case) changing on the daily rate. However, you are wanting to see every individual change to the issue.

If this is what you are looking for, I would recommend using the int_jira__daily_field_history intermediate model instead. This model should have the properly populated start and end times. Let me know if this helps!

jamienewcomb commented 2 years ago

No problem @fivetran-joemarkiewicz

yeah I also have missing data in that table for the valid_ending_at field. It’s a big data set and 40% of the field values are null. It looks like it completely random for if that field updates or not.

5EC1D7DE-A7E8-4C76-B6F0-8A2B295FE4C1 BDA0F860-89E6-4735-9362-EFC1B2FA80BE
fivetran-joemarkiewicz commented 2 years ago

@jamienewcomb this is very possible as these intermediate models are not intended to be used for reporting. Instead they are prepping the data along the way to then be used in our end models. I can investigate this further; however, this Bug seems to be more aligned with a Feature Request for a new end model that shows a similar behavior of our jira__daily_issue_field_history, but for every single change. This is something we can scope out and consider for a future update to our package.

In the meantime, I am curious about the table you are querying. Strangely enough, the table name seems to be different from the materialized name our package outputs (difference between int_jira__daily_field_history and your jira_daily_field_history. Are you by chance querying from a different table than the one our package outputs by chance?

jamienewcomb commented 2 years ago

@fivetran-joemarkiewicz

In the meantime, I am curious about the table you are querying. Strangely enough, the table name seems to be different from the materialized name our package outputs (difference between int_jira__daily_field_history and your jira_daily_field_history. Are you by chance querying from a different table than the one our package outputs by chance?

had a quick look into this by comparing the sources through dbt docs and it is certainly the same - It has just been named slightly different

It does feel like to me that if the field (valid ending at) is not always updating then this is a bug. I would challenge that if it's possible that this can happen at the rate it has on my dataset with 40%+ null values then we should probably just remove that field until it can be created via an end model so it does not cause any confusion (I am sceptical of this being the reason though). Lots of ifs here as this could be a me problem but we just need to confirm this on another dataset. The fact it has updated 60% cases tells me there could be something in the code which is slightly wrong which is preventing the others to update

The intention of the field is spot on in terms of what is required to pull some key agile metrics from this dataset https://fivetran.github.io/dbt_jira/#!/model/model.jira.int_jira__daily_field_history#details

Screenshot 2022-09-21 at 08 39 14
jamienewcomb commented 2 years ago

In the meantime, I have managed to hack together a fix for this myself using tableau prep builder but it would be nice to know if this is a bug for other people's benefit :)

fivetran-joemarkiewicz commented 2 years ago

@jamienewcomb I was looking through our own Jira data and found that this model was populating as expected (the valid_ending_at is only null if there is no additional change to the issue field). image

I've come to the understanding that although this model was not originally intended for end reporting, it should still have the valid_ending_at properly populated. Otherwise the end jira__daiy_issue_field_history model would not be accurate. Thank you for shedding this light on the model.

Unfortunately, I think it will be difficult to understand what error is occurring as I am unable to replicate the bug on my end. Therefore, I think it would be best for us to meet in person and debug live. If you are open to this, you can schedule time via my Calendly for us to troubleshoot and hopefully resolve.

jamienewcomb commented 2 years ago

@fivetran-joemarkiewicz Ah yeah that looks alot more complete than my model, appreciate you checking :)

I have raised a ticket in our data teams backlog to investigate this, so I will let you know what they come back with on this issue (hopefully they can get to this over the next week or so). If they cannot see anything I will get us all on a call so we can go through it in more detail. Thanks for the offer!