fivetran / dbt_microsoft_ads

Fivetran data transformations for Microsoft (Bing) Ads built using dbt.
https://fivetran.github.io/dbt_microsoft_ads/
Apache License 2.0
3 stars 6 forks source link

[Bug] Keywords in URL report causing timeout and fanout #16

Closed fivetran-joemarkiewicz closed 1 year ago

fivetran-joemarkiewicz commented 1 year ago

Is there an existing issue for this?

Describe the issue

The latest release of the package introduced new logic into the microsoft_ads__url_report which joins in the keyword report if auto tagging is enabled. However, even if auto-tagging is not enabled, the join still occurs. This join is causing timeout errors for users and is taking upwards of 30 minutes before erroring out.

Additionally, users have shared that the url report shows a drastically higher spend total than the microsoft_ads__ad_report. My understanding is that the keywords report join in the url report is causing a fanout that duplicates the records and has the potential to time out for customers with large data volumes.

Relevant error log or model output

Timeout error

Expected behavior

The microsoft_ads__url_report model does not timeout and does not cause a fan out error.

dbt Project configurations

N/A

Package versions

v0.5.0

What database are you using dbt with?

bigquery

dbt Version

v1.1.1

Additional Context

I am unsure how the fanout is occurring, but I do believe it is happening. I think an effective fix would be to allow a user to disable this component if they are not using autotagging. However, we still will want to know why the fan out is happening at all.

Are you willing to open a PR to help address this issue?

fivetran-joemarkiewicz commented 1 year ago

This seems to be a result of 1 Ad : Many Keywords. Therefore, we believe the join in the model has probably multiplied the report.spend.

What we could do is select the first keyword that maps into the ad if we want to include a coalesce default for utm_term. Alternatively, we could just remove it and just coalesce to "utm_term" value.

fivetran-joemarkiewicz commented 1 year ago

Here is a dbt Slack thread where we discussed this in more detail.