fivetran / dbt_ad_reporting

Fivetran's ad reporting dbt package. Combine your Facebook, Google, Pinterest, LinkedIn, Twitter, Snapchat, Microsoft, TikTok, Reddit, Amazon, and Apple Search advertising metrics using this package.
https://fivetran.github.io/dbt_ad_reporting/#!/overview
Apache License 2.0
150 stars 56 forks source link

BUG - Failed unique test results for google ads source #28

Closed oscarlukersmithsc closed 3 years ago

oscarlukersmithsc commented 3 years ago

Are you a current Fivetran customer? Oscar Lukersmith at SafetyCulture

Describe the bug We have started receiving failed test results for unique_stg_google_ads__final_url_performance_final_url_performance_id

Completed with 1 error and 0 warnings:

Failure in test unique_stg_google_ads__final_url_performance_final_url_performance_id (models/stg_google_ads.yml)
  Got 81684 results, configured to fail if != 0

  compiled SQL at target/compiled/google_ads_source/models/stg_google_ads.yml/schema_test/unique_stg_google_ads__final_u_3521f620458941ffffe40e69979b6e54.sql

The unique key doesn't appear to be unique in the underlying table image

Steps to reproduce N/A

Expected behavior For the test to pass, and to have a unique key

Project variables configuration


name: 'analytics'
version: '2.0.0'
config-version: 2

profile: 'default'

source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

vars:
  'dbt_date:time_zone': 'UTC'
  ad_reporting__pinterest_enabled: False
  ad_reporting__microsoft_ads_enabled: True
  ad_reporting__linkedin_ads_enabled: True
  ad_reporting__google_ads_enabled: True
  ad_reporting__twitter_ads_enabled: False
  ad_reporting__facebook_ads_enabled: True
  ad_reporting__snapchat_ads_enabled: False

target-path: "target"  # directory which will store compiled SQL files
clean-targets:         # directories to be removed by `dbt clean`
    - "target"
    - "dbt_modules"

models:
  +pre-hook: "{{ alter_quota(1000) }}"
  +post-hook: "{{ grant_permissions('select', 'ia', 'yes') }}"
  +bind: false
  analytics:
    exposure:
      tableau:
        +post-hook: "{{ grant_permissions('select', 'tableau') }}"

  # disable both pinterest models if not using pinterest ads
  pinterest:
    enabled: false

  pinterest_source:
    enabled: false

  # disable both twitter ads models if not using twitter ads
  twitter_ads:
    enabled: false

  twitter_ads_source:
    enabled: false

  # disable both snapchat ads models if not using snapchat ads
  snapchat_ads:
    enabled: false

  snapchat_ads_source:
    enabled: false

  # disable both google ads models if not using google ads
  linkedin:
    enabled: true
  linkedin_source:
    enabled: true

  google_ads:
    enabled: true
  google_ads_source:
    enabled: true

  microsoft_ads:
    enabled: true
  microsoft_ads_source:
    enabled: true

Warehouse

Additional context

Screenshots

Please indicate the level of urgency High Urgency

Are you interested in contributing to this package?

kristin-bagnall commented 3 years ago

@oscarlukersmithsc thanks for submitting this bug. Quick question for you - which version of the package are you currently using (from your packages.yml file)?

oscarlukersmithsc commented 3 years ago
- package: fivetran/ad_reporting
    version: 0.4.0
oscarlukersmithsc commented 3 years ago

I've just realised this is not the latest version.

oscarlukersmithsc commented 3 years ago

When I try and upgrade to the latest version

Compilation Error
  dbt found two resources with the name "google_ads__url_ad_adapter". Since these resources have the same name,
  dbt will be unable to find the correct resource when ref("google_ads__url_ad_adapter") is used. To fix this,
  change the name of one of these resources:
  - model.google_ads.google_ads__url_ad_adapter (models/url_adwords/google_ads__url_ad_adapter.sql)
  - model.google_ads.google_ads__url_ad_adapter (models/url_google_ads/google_ads__url_ad_adapter.sql) Code: 10004
kristin-bagnall commented 3 years ago

Do you see this error if you run dbt run --full-refresh?

oscarlukersmithsc commented 3 years ago

Unfortunately, I can't do a dbt run as its a compilation error

kristin-bagnall commented 3 years ago

Oh right, makes sense.

kristin-bagnall commented 3 years ago

I think what may be happening is that api_source variable introduced in the latest google_ads package might be overwritten somehow. Can you add this variable to your dbt_project.yml file, according to these instructions and let me know if that works?

oscarlukersmithsc commented 3 years ago

I have added it like this, but I still get the same error. I agree it seems to be getting confused between adwords and google_ads

source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

vars:
  'dbt_date:time_zone': 'UTC'
  ad_reporting__pinterest_enabled: False
  ad_reporting__microsoft_ads_enabled: True
  ad_reporting__linkedin_ads_enabled: True
  ad_reporting__google_ads_enabled: True
  ad_reporting__twitter_ads_enabled: False
  ad_reporting__facebook_ads_enabled: True
  ad_reporting__snapchat_ads_enabled: False
  api_source: google_ads  ## adwords by default
oscarlukersmithsc commented 3 years ago

I am actually using the adwords connector FYI

kristin-bagnall commented 3 years ago

Hmm, okay for some reason the first line of code in this model and this model aren't work as expected for you.

If you're using the adwords connector, this was designed to work without you needing to make any changes.

It might be best if you'd be willing to hop on an office hours with someone from the team to debug live. Here's the link to do that.

cc: @fivetran-joemarkiewicz and @DylanBaker

oscarlukersmithsc commented 3 years ago

Hey @fivetran-kristin no worries, unfortunately, that's 5am-6am my time as I am in Sydney, Australia. Is there an Australian-friendly office hours?

fivetran-joemarkiewicz commented 3 years ago

Hi @oscarlukersmithsc my apologies I was OOO on Friday. After reading through this thread the reason for the model compilation error in the google_ads models is due to the adwords vs Google Ads API functionality that was introduced in the latest release of the package. The package now uses a separate model file based on if you are using the adwords or Google Ads API that you can set like @fivetran-kristin mentioned above.

However, I noticed in your dbt_project.yml that if you set the models the have enabled: true for google_ads and google_ads_source then you are overriding the variable parameter and enabling both the adwords and Google Ads models. You can resolve the error by removing the google_ads enable config. Typically you only need to add the enabled config if you are disabling. I will add stronger logic for these models that you should not add them to your config if you are keeping them enabled.

If you use the below dbt_project.yml config then you should see the compilation error be resolved. Let me know if the unique key test continues to fail as well. I just checked with my own data and noticed the key to be unique. If the issue still persists I would recommend reaching out to your CSM to help identify why your source data would have these duplicates.

name: 'analytics'
version: '2.0.0'
config-version: 2

profile: 'default'

source-paths: ["models"]
analysis-paths: ["analysis"]
test-paths: ["tests"]
data-paths: ["data"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

vars:
  'dbt_date:time_zone': 'UTC'
  ad_reporting__pinterest_enabled: False
  ad_reporting__microsoft_ads_enabled: True
  ad_reporting__linkedin_ads_enabled: True
  ad_reporting__google_ads_enabled: True
  ad_reporting__twitter_ads_enabled: False
  ad_reporting__facebook_ads_enabled: True
  ad_reporting__snapchat_ads_enabled: False

target-path: "target"  # directory which will store compiled SQL files
clean-targets:         # directories to be removed by `dbt clean`
    - "target"
    - "dbt_modules"

models:
  +pre-hook: "{{ alter_quota(1000) }}"
  +post-hook: "{{ grant_permissions('select', 'ia', 'yes') }}"
  +bind: false
  analytics:
    exposure:
      tableau:
        +post-hook: "{{ grant_permissions('select', 'tableau') }}"

  # disable both pinterest models if not using pinterest ads
  pinterest:
    enabled: false

  pinterest_source:
    enabled: false

  # disable both twitter ads models if not using twitter ads
  twitter_ads:
    enabled: false

  twitter_ads_source:
    enabled: false

  # disable both snapchat ads models if not using snapchat ads
  snapchat_ads:
    enabled: false

  snapchat_ads_source:
    enabled: false

  # disable both google ads models if not using google ads
  linkedin:
    enabled: true
  linkedin_source:
    enabled: true

  # Removed google ads enabled config as that was causing the compilation error.

  microsoft_ads:
    enabled: true
  microsoft_ads_source:
    enabled: true

You can also remove the other enabled: true model configs if you would like.

oscarlukersmithsc commented 3 years ago

Hey @fivetran-joemarkiewicz ! Thanks so much for the response, that indeed fixed that problem. Unfortunately, it is still failing the unique test runs. I have tried to resync the appropriate table through fivetran and it still fails the test. Any ideas?

Failure in test unique_stg_google_ads__final_url_performance_final_url_performance_id (models/adwords_connector/stg_google_ads.yml)
  Got 83573 results, configured to fail if != 0

  compiled SQL at target/compiled/google_ads_source/models/adwords_connector/stg_google_ads.yml/schema_test/unique_stg_google_ads__final_u_3521f620458941ffffe40e69979b6e54.sql
fivetran-joemarkiewicz commented 3 years ago

@oscarlukersmithsc below are the fields that make up final_url_performance_id and which should be unique to my understanding and are unique using our Google Ads data. https://github.com/fivetran/dbt_google_ads_source/blob/871ceccaf8d2432c70f26117a157a03035b22d06/models/adwords_connector/stg_google_ads__final_url_performance.sql#L61

{{ dbt_utils.surrogate_key(['date_day','campaign_id','ad_group_id','final_url']) }} as final_url_performance_id

Would you be able to query one of the final_url_performance_id fields that is showing as a duplicate and confirm if it is indeed an intentional duplicate that we may have not accounted for in the package, or if it is an unintentional duplicate that needs to be addressed at the connector level.

Thanks!

oscarlukersmithsc commented 3 years ago

Hey @fivetran-joemarkiewicz, it looks like Fivetran is definitely ingesting data which would cause that to not be unique. I have copied a sample (and anonymized) where all records have the same final_url_performance_id. It looks like Fivetran is triggering multiple sync for that same "id" https://docs.google.com/spreadsheets/d/1YAkFcznqZOm6NPCbOjKf76kr8uVZVu7T3ULPv1u1a80/edit#gid=0

fivetran-joemarkiewicz commented 3 years ago

@oscarlukersmithsc thanks for digging into this and confirming it looks to be an issue on the connector side. Unfortunately our team that maintains the dbt packages does not have visibility into these occurrences.

I would recommend reaching out to your account manager or csm to help open a ticket as to why Fivetran would be bringing in these duplicates.

fivetran-joemarkiewicz commented 3 years ago

@oscarlukersmithsc let me know if you have been able to contact your account manager or customer support rep for next steps. If so, we should be able to close this issue on the dbt package side.

oscarlukersmithsc commented 3 years ago

Hey @fivetran-joemarkiewicz, I have opened a support ticket with the support team. Thankyou