fivetran / dbt_mailchimp

Fivetran's Mailchimp dbt package
https://fivetran.github.io/dbt_mailchimp/
Apache License 2.0
3 stars 5 forks source link

[Bug] `dbt.datediff()` in `mailchimp__campaign_activities` errors in Redshift #41

Closed fivetran-catfritz closed 10 months ago

fivetran-catfritz commented 11 months ago

Is there an existing issue for this?

Warehouse

Describe the issue

Redshift user was receiving the below error:

04:56:42  Database Error in model mailchimp__campaign_activities (models/mailchimp__campaign_activities.sql)
04:56:42    function pg_catalog.date_diff("unknown", timestamp with time zone, timestamp with time zone) does not exist

The reason for the error is in Redshift, datediff does not work on datatype "timestamp with time zone". It must be converted to "timestamp without time zone" to run without error. I was able to reproduce the error and correct it by casting to dbt.type_timestamp().

This is causing errors in mailchimp__campaign_activities for some Redshift users due to these lines 16-18.

To fix this, we can either cast the fields in question at the source or in the final model. While typically we try to adjust casts at the source, I think we could go either way in this instance. On one hand, we might want to preserve the timezone data in the upstream models, in case the user finds it useful. In this model, since we are only interested in the difference and not the timezone, the timezone does not matter. On the other hand, converting everything to UTC "timestamp with time zone" has been our standard, and maintaining that predictability might be the most important thing here.

If we cast in the end model, I got the below to work--sharing in case it is useful:

{{ dbt.datediff('cast(campaigns.send_timestamp as dbt.type_timestamp())','cast(activities.activity_timestamp as dbt.type_timestamp())',
    'minute') }} as time_since_send_minutes,
{{ dbt.datediff('cast(campaigns.send_timestamp as dbt.type_timestamp())','cast(activities.activity_timestamp as dbt.type_timestamp())',
    'hour') }} as time_since_send_hours,
{{ dbt.datediff('cast(campaigns.send_timestamp as dbt.type_timestamp())','cast(activities.activity_timestamp as dbt.type_timestamp())'),
    'day') }} as time_since_send_days

I don't have a strong opinion which way we go here--just my thoughts while investigating on this issue.

fivetran-joemarkiewicz commented 10 months ago

This fix is now live in the latest release of the package.