fivetran / dbt_jira

Data models for Fivetran's Jira connector built using dbt.
https://fivetran.github.io/dbt_jira/
Apache License 2.0
8 stars 15 forks source link

[Bug] fivetran_utils.string_agg uses occasionally cause failures due to size limitations on redshift #132

Closed fivetran-joemarkiewicz closed 4 days ago

fivetran-joemarkiewicz commented 3 weeks ago

Is there an existing issue for this?

Describe the issue

Throughout this package we leverage the fivetran_utils.string_agg macro in a few places to aggregate string information for use in downstream enriched models. However, for Redshift destinations there is a string limit within a string datatype which can sometimes result in the below error message. We need to ensure Redshift destinations are able to succeed even if the string length exceeds the limit.

Relevant error log or model output

result size exceeds listagg limit

Expected behavior

There are no failures for Redshift users when using this package

Possible solution

We can either take one of two options:

  1. We remove the string_agg fields if they are not providing value to the end models to ensure this runtime error isn't an issue on any destination. However, this is not ideal to me and something I would recommend we do not pursue as the solution.
  2. Leverage a similar solution we have in the past (such as Mixpanel) where we truncate the string if it exceeds the estimated character threshold. This will address this for Redshift users and not impact the functionality at all for other destinations. However, the downside here is that for these Redshift users we will be truncating the string agg which could result in some data loss.
  3. Some other alternative that has yet been identified which will allow us to retain the information in the string agg and ensure runtime success on destinations like Redshift even when the character length exceeds a certain threshold.

dbt Project configurations

Default config

Package versions

Latest

What database are you using dbt with?

redshift

How are you running this dbt package?

Fivetran Quickstart Data Model, Fivetran Transformations, dbt Core™, dbt Cloud™

dbt Version

Latest

Additional Context

No response

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 4 days ago

Closing this issue as it's been addressed in the latest release