fivetran / dbt_hubspot

Data models for Hubspot built using dbt.
https://fivetran.github.io/dbt_hubspot/
Apache License 2.0
33 stars 39 forks source link

BUG - Deal and deal stages data type join mismatch on Redshift #61

Closed scotthibberd closed 2 years ago

scotthibberd commented 2 years ago

Are you a current Fivetran customer? Yes. Scott Hibberd - Infogrid, BI Analyst

Describe the bug Running dbt run returns an error for the hubspot__deals and hubspot__deal_stages models

The error I receive is JOIN/USING types bigint and character varying cannot be matched

Steps to reproduce

  1. Import the Fivetran Hubspot package
  2. Run dbt run

Expected behavior Models should build without error

Project variables configuration

name: 'infogrid_bi'
version: '1.0.0'
config-version: 2

profile: 'default'

model-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
seed-paths: ["data"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"  # directory which will store compiled SQL files
clean-targets:         # directories to be removed by `dbt clean`
    - "target"
    - "dbt_modules"

vars:

  hubspot__pass_through_all_columns: true 

  hubspot_email_event_spam_report_enabled: false
  hubspot_email_event_print_enabled: false
  hubspot_email_event_forward_enabled: false

Package Version

  - package: fivetran/hubspot
    version: [">=0.5.0", "<0.6.0"]

Warehouse

Additional context For both models, the build fails due to the join within the deal_fields_joined CTE

It fixes for me to rewrite the join as follow:

select 
        deals.*,
        pipelines.pipeline_label,
        pipelines.is_active as is_pipeline_active,
        pipeline_stages.pipeline_stage_label,
        owners.email_address as owner_email_address,
        owners.full_name as owner_full_name

    from deals
    left join pipelines on deals.deal_pipeline_id = cast(pipelines.deal_pipeline_id as bigint)
    left join pipeline_stages on deals.deal_pipeline_stage_id = cast(pipeline_stages.deal_pipeline_stage_id as bigint)
    left join owners on deals.owner_id = owners.owner_id

Screenshots

Please indicate the level of urgency Currently in the process of building Hubspot reporting - we can workaround temporarily by overwriting the package model files, but a longer term solution is preferable.

Are you interested in contributing to this package?

fivetran-joemarkiewicz commented 2 years ago

Thanks for opening this @scotthibberd!

This is a fix we most likely would want to apply to the package. However, we would want to do it a bit differently than the fix you shared to make it dynamic for users using this package on other warehouses.

What we would want to do is cast the fields within the dbt_hubspot_source package and then apply the left join as you have shown, but without the casting. For example, we would cast deal_pipeline_id in the staging model to be the following:

cast(pipeline_id as {{ dbt_utils.type_int() }}) as deal_pipeline_id

This way the casting will be appropriate for users that are not on Redshift as well. Let me know if this makes sense. In the meantime, feel free to open a PR to adjust the joins in this package and I can update the source to cast appropriately, or I can open both PRs by the end of this week. Similarly, let me know if you would like to open the PRs for both and I can review! 😄

scotthibberd commented 2 years ago

That makes sense in terms of making the fix more dynamic. Please could you open the PRs for that?

Thanks for your help.

fivetran-joemarkiewicz commented 2 years ago

Hi @scotthibberd I have been able to open the above noted PRs to apply this quick fix.

I actually found that we should be casting these fields to a string type since the IDs in my data are not consistently integers. Regardless, I was able to apply the fixes and have found the package to compile and run as intended locally. Before I open my PR for review and merge any changes, it would be great if you could give the below version of the package a try. You can simply remove the package dependency in your packages.yml and use the below config:

FYI if you have the hubspot_source package as a depenency, you will want to remove that as well for the branch to work as intended.

packages:
- git: https://github.com/fivetran/dbt_hubspot.git
revision: bugfix/casting-and-joins
warn-unpinned: false

Let me know how the working branch goes!

scotthibberd commented 2 years ago

Amazing - thanks @fivetran-joemarkiewicz ! I've just tested it and that's working for me now.

fivetran-joemarkiewicz commented 2 years ago

@scotthibberd these updates are now live in the latest versions of the hubspot package. Thanks so much for raising this issue.

I will close this issue, but please feel free to open a new issue if you encounter and other questions or errors.