fivetran / dbt_hubspot_source

Data models for Hubspot built using dbt.
https://fivetran.github.io/dbt_hubspot_source/
Apache License 2.0
33 stars 30 forks source link

[Bug] custom fields unexpectedly change datatype #97

Closed kandji-alex closed 1 year ago

kandji-alex commented 1 year ago

Is there an existing issue for this?

Describe the issue

We're ingesting custom Hubspot fields via hubspot__deal_pass_through_columns argument.

When these are timestamps, some appear to be created in Hubspot's database as a NUMBER(38) while others are TIMESTAMP_TZ.

We're converting the NUMBER ones into timestamps in downstream dbt models.

HOWEVER, one of these fields has changed from a NUMBER to a TIMESTAMP_TZ unexpectedly. This broke our dbt snapshots on this data.

The Hubspot users have ensured me they did nothing to that field. Perhaps Hubspot internally changed something in their database, or perhaps the end users did alter something without realizing.

Either way, this field was being built as a NUMBER for months, and then randomly switched to a TIMESTAMP_TZ. I was able to find the query in the Snowflake history, in which Fivetran issued a simple alter statement to this table and started upserting with the new time after that. I'm not able to pin down an event that would've caused this change.

Relevant error log or model output

:25:07  Completed with 1 error and 0 warnings:
19:25:07  
19:25:07  Database Error in snapshot hubspot__deal__snapshot (snapshots/hubspot_source/hubspot__deal__snapshot.sql)
19:25:07    002023 (22000): SQL compilation error:
19:25:07    Expression type does not match column data type, expecting NUMBER(38,0) but got TIMESTAMP_TZ(9) for column PROPERTY_CHANNEL_INFLUENCED_DATE
19:25:07    compiled Code at xxx.sql

Expected behavior

I expect these source tables with custom fields to never change underlying datatype unless something actively seeks to do so.

dbt Project configurations

hubspot__deal_pass_through_columns:

Package versions

packages:
  - package: elementary-data/elementary
    version: [">=0.7.0", "<0.8.0"]
  - package: get-select/dbt_snowflake_monitoring
    version: 2.0.2
  - package: fivetran/google_ads
    version: [">=0.9.0", "<0.10.0"]
  - package: fivetran/hubspot
    version: [">=0.8.0", "<0.9.0"]
  - package: fivetran/facebook_ads
    version: [">=0.6.0", "<0.7.0"]
  - package: fivetran/linkedin
    version: [">=0.6.0", "<0.7.0"]
  - package: fivetran/ad_reporting
    version: [">=1.2.0", "<1.3.0"]
  - package: dbt-labs/codegen
    version: [">=0.9.0", "<0.10.0"]
  - package: dbt-labs/dbt_external_tables
    version: 0.8.2
  - package: Snowflake-Labs/dbt_constraints
    version: [">=0.6.0", "<0.7.0"]

What database are you using dbt with?

snowflake

dbt Version

  - installed: 1.3.1
  - latest:    1.4.4 - Update available!

Plugins:
  - snowflake: 1.3.0 - Update available!

Additional Context

No response

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 1 year ago

Hi @kandji-alex thanks for opening this issue.

Perhaps Hubspot internally changed something in their database

I believe you hit the nail on the head the head with this statement. HubSpot recently release a new API and as far as I am aware one of the implications was that all time fields were converted to timestamp_tz. There was nothing that was adjusted within this dbt package (although we are working to update the package to account for the API changes), but rather the change came from the Fivetran connector adjusting to account for the new HubSpot API.

I am currently working at the moment to understand these HubSpot changes further. However, our support and product teams should be able to add more information and details around the connector change. I would highly recommend opening a support ticket where you will likely be able to get more insight around this connector change.

fivetran-jamie commented 1 year ago

the fix for this should officially be in v0.9.0 of the package!