fivetran / dbt_jira_source

Fivetran's Jira source dbt package
https://fivetran.github.io/dbt_jira_source/
Apache License 2.0
11 stars 11 forks source link

BUG - Errors in the transformation of dbt_jira_model #23

Closed finners2 closed 2 years ago

finners2 commented 2 years ago

Are you a current Fivetran customer?

Finlay Wotton Data Analyst Zilch Technology

Describe the bug

Database Error in model jira__issue_enhanced (models/jira__issue_enhanced.sql)
  function pg_catalog.date_diff("unknown", timestamp with time zone, timestamp with time zone) does not exist
  HINT:  No function matches the given name and argument types. You may need to add explicit type casts.
  compiled SQL at target/run/jira/models/jira__issue_enhanced.sql

Seeing this error appear when transforming the jira data using the dbt-fivetran-transformation-model.

Occuring in the Jira_model under the jira__issue_enhanced.sql script.

Expected behavior Expect the transform to occur without errors

Project variables configuration

name: 'jira'
version: '0.5.1'
config-version: 2
require-dbt-version: ">=0.20.0"

vars:

  jira:
    issue: "{{ ref('stg_jira__issue') }}"
    project: "{{ ref('stg_jira__project') }}"
    user: "{{ ref('stg_jira__user') }}"
    issue_type: "{{ ref('stg_jira__issue_type') }}"
    status: "{{ ref('stg_jira__status') }}"
    status_category: "{{ ref('stg_jira__status_category') }}"
    resolution: "{{ ref('stg_jira__resolution') }}"
    priority: "{{ ref('stg_jira__priority') }}"
    issue_multiselect_history: "{{ ref('stg_jira__issue_multiselect_history') }}"
    issue_field_history: "{{ ref('stg_jira__issue_field_history') }}"
    comment: "{{ ref('stg_jira__comment') }}"
    issue_link: "{{ ref('stg_jira__issue_link') }}"
    component: "{{ ref('stg_jira__component') }}"
    field: "{{ ref('stg_jira__field') }}"
    sprint: "{{ ref('stg_jira__sprint') }}"
    version: "{{ ref('stg_jira__version') }}"

    jira_include_comments: true  # this package aggregates issue comments so that you have a single view of all your comments in the jira__issue_enhanced table. This can cause limit errors if you have a large dataset. Disable to remove this functionality.
    jira_using_sprints: true # disable if you are not using sprints in Jira
    jira_using_versions: true # disable if you are not using versions in Jira
    jira_issue_history_buffer: 1 # in months

models:
  jira:
    +materialized: table # field history tables are configured to be incremental, except for int_jira__field_history_scd, which has to be a table
    +schema: jira
    intermediate:
      +materialized: ephemeral 
      +schema: int_jira # field history models will be written here (and are not ephemeral, so they'll exist in the warehouse)

Package Version

packages:
  - package: fivetran/jira_source
    version: [">=0.3.0","<0.4.0"]

Warehouse

- [ ] BigQuery - [x] Redshift - [ ] Snowflake - [ ] Postgres - [ ] Databricks - [ ] Other (provide details below) **Additional context** **Screenshots**

Please indicate the level of urgency Testing for the first time.

Are you interested in contributing to this package?

fivetran-joemarkiewicz commented 2 years ago

Hi @finners2 thanks so much for opening this issue and I am sorry to see you are experiencing this error.

I have actually seen this same error popping up in other Redshift users over the last few weeks. The result is a timestamp that the package anticipates being of datatype timestamp is actually being synced as timestamp_tz. We have been able to solve this usually by adjusting the source package to convert the field in question from timestamp_tz to timestamp.

I will post back here later today with a working branch you may test to see if the issue still persists.

fivetran-joemarkiewicz commented 2 years ago

@finners2 I apologize for the delayed response and appreciate your patience!

I was able to apply a fix that I believe will address the issue you are experiencing. Would you be able to test the below package in place of your existing Jira dbt package dependency. Once you swap this package dep in you can run dbt deps and then dbt run.

packages:
    - git: https://github.com/fivetran/dbt_jira.git
      revision: hotfix/timestamp-casting
      warn-unpinned: false

I essentially hard cast the stg_jira__issue_field_history model as a timestamp across warehouses to hopefully avoid the timestamp_tz issue you are seeing. Let me know if this addresses the error!

finners2 commented 2 years ago

Hi @fivetran-joemarkiewicz does the package you've linked include both jira_source and jira?

fivetran-joemarkiewicz commented 2 years ago

It does! There is a dependency within the ‘packages.yml’ that installs the source package as well.

finners2 commented 2 years ago

@fivetran-joemarkiewicz just tried to run the transformation again in fivetran and received the exact same error

18 of 22 ERROR creating table model warehouse_ext.jira__issue_enhanced [ERROR in 4.33s]
14:09:32 | CANCEL query model.jira.jira__issue_enhanced......................... [CANCEL]
Database Error in model jira__issue_enhanced (models/jira__issue_enhanced.sql)
  function pg_catalog.date_diff("unknown", timestamp with time zone, timestamp with time zone) does not exist
  HINT:  No function matches the given name and argument types. You may need to add explicit type casts.
  compiled SQL at target/run/jira/models/jira__issue_enhanced.sql
Encountered an error:
FailFast Error in model jira__issue_enhanced (models/jira__issue_enhanced.sql)
  Failing early due to test failure or runtime error

Let me know if I can provide any further detail that could help!

fivetran-joemarkiewicz commented 2 years ago

Hey @finners2 thanks for testing and I am sorry to see this issue is still persisting. Since I am unable to replicate this issue on my end, I believe the next best alternative would be for us to jump on a quick call and look closer into this together.

If you have time, it would be great if you could schedule time via our office hours link and we can troubleshoot (and hopefully resolve) on that call.

Thanks!

finners2 commented 2 years ago

@fivetran-joemarkiewicz no worries, happy to jump on a call with you - booked it in. Is it possible to do any earlier as it's 7.30pm in London and my team will have finished work by then. Would be good to have them on board as I'm still a junior so they understand the data modelling better. Let me know! Cheers

fivetran-joemarkiewicz commented 2 years ago

@finners2 of course! I see you booked time for Monday 12/13. Would 9am CST work for you instead of the time you setup during our office hours?

If so, I can send out a new meeting invite for then. Let me know, thanks!

finners2 commented 2 years ago

@fivetran-joemarkiewicz that would be perfect, thank you Joe! I'll forward on the invite to any members of the team who wish to join if that's ok?

fivetran-joemarkiewicz commented 2 years ago

@finners2 I actually just encountered this same issue with another customer in our dbt_zendesk package and believe I found a resolution. Would you be able to try the hotfix branch again and let me know if it resolves the error?

Essentially, I opted to not use dbt_utils.type_timestamp and instead use timestamp without time zone and found it resolved the issue with the other customer. Let me know if this works! https://github.com/fivetran/dbt_jira_source/blob/aac31d6cdb6e4331786fe336fe42beea1da034ae/models/stg_jira__issue_field_history.sql#L30

finners2 commented 2 years ago

Hey @fivetran-joemarkiewicz just tried running it again but nothing changed unfortunately.

fivetran-joemarkiewicz commented 2 years ago

@finners2 I am sorry to hear it didn't change anything. I actually noticed the casting should be performed within stg_jira__issue as well. I just pushed another update applying the additional casting to the respective fields.

When you have a free moment, would you be able to try again?

finners2 commented 2 years ago

@fivetran-joemarkiewicz Hey man just tried it again and it worked! Thanks for your help with this.

fivetran-joemarkiewicz commented 2 years ago

🎉 Woohoo 🎉

That is great to hear! I will make a few formatting updates to this branch then open a PR to be integrated into the next release for dbt_jira_source. Thanks for all your help in testing this out and for raising the issue to the team!

finners2 commented 2 years ago

Anytime Joe, have a good day!

kpounder commented 2 years ago

@fivetran-joemarkiewicz I'm running into this same issue now with the GitHub source (https://github.com/fivetran/dbt_github/issues/25). After doing some more looking, I think it is because Redshift's datediff function does not support the difference of two timestamptz. I had the idea to modify dbt_utils.datediff to explicitly cast both dates as timestamp for Redshift. Here's my PR: https://github.com/dbt-labs/dbt-utils/pull/483. Thoughts on that? It works locally for me, but I'm also new to Fivetran and dbt. Thanks!

fivetran-joemarkiewicz commented 2 years ago

Hi @kpounder I actually opened a PR on dbt-utils previously to allow the dbt_utils.type_timestamp() macro to cast any redshift dates to a non timezone format. The fix would probably be just adding that cast to the source package for Github.

I can open a PR for that, or feel free to open that if you would like to!

kpounder commented 2 years ago

Hi @fivetran-joemarkiewicz, just to make sure I'm understanding... you're saying to add ::{{ dbt_utils.type_timestamp() }} to all of the timestamp fields in the source GitHub package? Then they'll all be timestamp instead of timestamptz.

gtang31-te commented 1 year ago

hey @fivetran-joemarkiewicz , would your changes to dbt_utils also propagate to https://github.com/fivetran/dbt_salesforce_formula_utils as well? I am running into the same error for that package as well!