fivetran / dbt_salesforce_formula_utils

Package containing dbt macros to help generate salesforce formula fields synced from Fivetran.
https://fivetran.github.io/dbt_salesforce_formula_utils/
Apache License 2.0
16 stars 18 forks source link

[Bug] Join of two tables does not respect Fivetran history mode #93

Closed icemtel closed 1 year ago

icemtel commented 1 year ago

Is there an existing issue for this?

Describe the issue

I am using Fivetran in History mode (which enables SCD2 on source tables).

On Task object in salesforce, I have a formula, that basically looks up a field on the task owner: Owner:User.SomeFieldName__c.

In this case, salesforce_formula_utils.sfdc_formula_view compiles into a join of salesforce.task and salesforce.user. Because both of the tables are in history mode, user.id is not unique and join produces a huge table, much larger than the original salesforce.task.

Here's the dbt model code I'm using:

{{ salesforce_formula_utils.sfdc_formula_view(
    source_table='task',
    full_statement_version=true) 
}}

Which gets compiled into:

SELECT mt.id,
           mt._fivetran_start,mt._fivetran_end,mt._fivetran_active,
           jt1.some_field_name_c AS xyz,
FROM salesforce.task AS mt
LEFT JOIN salesforce.user AS jt1 ON mt.owner_id=jt1.id 

Note: I removed irrelevant columns and formatted for readability.

Relevant error log or model output

No response

Expected behavior

When we join to get extra information, it should never change number of rows.

If the package does not support history mode, this should be clearly labelled. If you want to support history mode as well, then you may need to decide whether to join rows from the past with rows from the past, or with the active ones.

dbt Project configurations

I believe the problem does not depend on my configurations.

Package versions

dbt deps
08:32:52  Running with dbt=1.5.0
08:32:55  Installing dbt-labs/codegen
08:32:55  Installed from version 0.10.0
08:32:55  Up to date!
08:32:55  Installing calogica/dbt_date
08:32:56  Installed from version 0.7.2
08:32:56  Up to date!
08:32:56  Installing fivetran/salesforce_formula_utils
08:32:56  Installed from version 0.9.0
08:32:56  Up to date!
08:32:56  Installing dbt-labs/dbt_utils
08:32:57  Installed from version 1.1.1
08:32:57  Up to date!
08:32:57  Installing fivetran/fivetran_utils
08:32:57  Installed from version 0.4.7
08:32:57  Up to date!

What database are you using dbt with?

bigquery

dbt Version

1.5.0

Additional Context

No response

Are you willing to open a PR to help address this issue?

icemtel commented 1 year ago

On the other hand, running with full_statement_version=false produces the expected behavior (it does not include the formula that caused the issue and there's no join in the compiled code).

fivetran-joemarkiewicz commented 1 year ago

Hi @icemtel thanks for raising this issue and I can see how this would cause an issue by not including additional conditions within the join to account for the history modes.

Unfortunately, this package only materializes the sql provided within the fivetran_formula and fivetran_formula_model tables synced by your Salesforce connector. The actual sql generation and translation is performed within the connector and by our connector team. Therefore, I will not be able to help resolve the error directly within this package.

However, you may open a support ticket and work directly with our customer support and product teams to have our engineers explore the issue you are seeing and work through a solution.

icemtel commented 1 year ago

@fivetran-joemarkiewicz done ✅

fivetran-joemarkiewicz commented 1 year ago

Great! I will close this issue as the support ticket was the way to go!

icemtel commented 1 year ago

In case anyone stumbles into this - Fivetran support has fixed this - only _fivetran_active rows are be joined.