fivetran / dbt_hubspot

Data models for Hubspot built using dbt.
https://fivetran.github.io/dbt_hubspot/
Apache License 2.0
33 stars 38 forks source link

[Bug] Missing column in intermediate model if variable is set #102

Closed antoon-r closed 1 year ago

antoon-r commented 1 year ago

Is there an existing issue for this?

Describe the issue

When running project with var hubspot_email_event_status_change_enabled: false, it fails with Database Error

Problem is on this line: https://github.com/fivetran/dbt_hubspot/blob/7d76e9431ab44dea1ec2a85ad27b81902b7b16fb/models/marketing/hubspot__email_sends.sql#L43

When column unsubscribes is not created, it should not be accessed, too. It can be fixed by using custom email_metrics array in dbt_project.yml, but it is impossible to figure it out without looking into the code of package

Relevant error log or model output

09:39:53  Database Error in model hubspot__email_campaigns (models/marketing/hubspot__email_campaigns.sql)
09:39:53    Name unsubscribes not found inside email_sends at [65:25]
09:39:53    compiled Code at target/run/hubspot/models/marketing/hubspot__email_campaigns.sql
09:39:53  
09:39:53  Database Error in model int_hubspot__email_metrics__by_contact_list (models/marketing/intermediate/int_hubspot__email_metrics__by_contact_list.sql)
09:39:53    Unrecognized name: unsubscribes at [74:13]
09:39:53    compiled Code at target/run/hubspot/models/marketing/intermediate/int_hubspot__email_metrics__by_contact_list.sql
09:39:53  
09:39:53  Database Error in model hubspot__contacts (models/marketing/hubspot__contacts.sql)
09:39:53    Unrecognized name: unsubscribes at [112:13]
09:39:53    compiled Code at target/run/hubspot/models/marketing/hubspot__contacts.sql

Expected behavior

Run does not fail with Database Error

dbt Project configurations

vars: hubspot_email_event_status_change_enabled: false

Package versions

packages:

What database are you using dbt with?

bigquery

dbt Version

1.3.1

Additional Context

No response

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 1 year ago

HI @antoon-r thanks for opening this bug report. I agree with you that these downstream models fail when the hubspot_email_event_status_change_enabled variable is set to false, but the other variables are enabled. I believe this may have been missed because we may have not come across a user who has email_event_status_change turned off, but has all other email events turned on. Regardless, this is something we will want to address within the package.

Upon further scoping I found the following additional problems areas:

Based on the above, it is clear that we should not be iterating on a field that could possibly not exist in the data based if a variable is set to false. I believe what may be the optimal path forward is to add the functionality to the package to dynamically infer the metrics. If the metric doesn't exist, then it can be skipped.

Nevertheless, we will add this to our upcoming sprint to be addressed. Thanks again for raising this to our team!

fivetran-catfritz commented 1 year ago

Hi @antoon-r I've created a test branch for you to try out. To use it, simply replace the dbt_hubspot section in your packages.yml with the following:

- git: https://github.com/fivetran/dbt_hubspot.git
  revision: bug/email-metrics-variable
  warn-unpinned: false

You should be able to use this new branch without any further modifications. Please let me know how it works for you!

antoon-r commented 1 year ago

Hi @fivetran-catfritz this branch fixes the problem, thank you.