A validation test exists within this package to ensure that no combination of rows grouped by campaign ID, source relation, and _fivetran_synced timestamp should appear more than once. However, I'm encountering an issue in our raw data where multiple campaigns have campaign names but no campaign IDs. This test therefore fails and I'm unable to build a campaign-level model.
Relevant error log or model output
with validation_errors as (
select
source_relation, campaign_id, _fivetran_synced
from ANALYTICS_DEV_AKSHAY_KAMAL.STAGING.stg_pinterest_ads__campaign_history
group by source_relation, campaign_id, _fivetran_synced
having count(*) > 1
)
select *
from validation_errors;
Expected behavior
I expected a unique ID to be assigned to my campaign data since it was originally synced via the Pinterest API. This may also be a source data quality issue, but in my investigation into the Pinterest Ads and Pinterest Ads Source packages, I encountered a number of macros from other packages that might be performing this work.
dbt Project configurations
models:
+sql_header: "alter session set timezone = 'UTC';"
+transient: "{{ false if target.name.upper() in ('PROD', 'DAGSTER_CLI_PROD', 'DAGSTER_SDA_PROD') else true }}"
allbirds_dbt:
marts:
+materialized: table
commercial:
digital:
+schema: commercial_digital
retail:
+schema: commercial_retail
core:
+materialized: table
+schema: core
finance_accounting:
+schema: finance_accounting
internal_metadata:
snowflake:
+schema: internal_metadata__snowflake
observability:
+schema: observability
operations:
+materialized: table
+schema: operations
scaffolds:
+materialized: view
+schema: staging
staging:
+materialized: view
+schema: staging
dbt_artifacts:
+database: _dbt_artifacts
+schema: internal_metadata__dbt_artifacts
ad_reporting:
+schema: staging
facebook_ads:
+schema: staging
facebook_ads_source:
+schema: staging
google_ads:
+schema: staging
google_ads_source:
+schema: staging
microsoft_ads:
+schema: staging
microsoft_ads_source:
+schema: staging
pinterest:
+schema: staging
pinterest_source:
+schema: staging
tiktok_ads:
+schema: staging
tiktok_ads_source:
+schema: staging
twitter_ads:
+schema: staging
twitter_ads_source:
+schema: staging
seeds:
+schema: internal_metadata
vars:
'dbt_date:time_zone': 'UTC'
surrogate_key_treat_nulls_as_empty_strings: true
allbirds_dbt:
google_analytics_360__fivetran_cutover_date: '2022-07-01'
unified_analytics__ga360_cutover_date: '2024-01-01 00:00:00.000 +0000'
unified_analytics__rudderstack_start_date: '2022-12-01 00:00:00.000 +0000'
unified_analytics__default_marketing_channel: 'Not_Set'
unified_analytics__attribution_lookback_days: 14
shopify__fivetran_instance_names: [
'shopifyau',
'shopifyca',
'shopifyeu',
'shopifyjp',
'shopifynz',
'shopifysk',
'shopifyuk',
'shopifyus'
]
loop__fivetran_instance_names: [
'fivetran_loop_returns_ca',
'fivetran_loop_returns_eu',
'fivetran_loop_returns_uk',
'fivetran_loop_returns_us',
'fivetran_loop_returns_ca_dev',
'fivetran_loop_returns_eu_dev',
'fivetran_loop_returns_uk_dev',
'fivetran_loop_returns_us_dev'
]
facebook_ads_schema: facebook_ads
facebook_ads_database: fivetran
google_ads_schema: google_ads
google_ads_database: fivetran
microsoft_ads_schema: bing_ads
microsoft_ads_database: fivetran
pinterest_schema: pinterest_ads_7_7_1
pinterest_database: fivetran
tiktok_ads_schema: tiktok_ads
tiktok_ads_database: fivetran
twitter_ads_schema: twitter_ads
twitter_ads_database: fivetran
ad_reporting__amazon_ads_enabled: False # by default this is assumed to be True
ad_reporting__apple_search_ads_enabled: False # by default this is assumed to be True
ad_reporting__linkedin_ads_enabled: False # by default this is assumed to be True
ad_reporting__reddit_ads_enabled: False # by default this is assumed to be True
ad_reporting__snapchat_ads_enabled: False # by default this is assumed to be True
apple_search_ads__using_search_terms: False # by default this is assumed to be True
pinterest__using_keywords: False # by default this is assumed to be True
twitter_ads__using_keywords: False # by default this is assumed to be True
Is there an existing issue for this?
Describe the issue
A validation test exists within this package to ensure that no combination of rows grouped by campaign ID, source relation, and _fivetran_synced timestamp should appear more than once. However, I'm encountering an issue in our raw data where multiple campaigns have campaign names but no campaign IDs. This test therefore fails and I'm unable to build a campaign-level model.
Relevant error log or model output
Expected behavior
I expected a unique ID to be assigned to my campaign data since it was originally synced via the Pinterest API. This may also be a source data quality issue, but in my investigation into the Pinterest Ads and Pinterest Ads Source packages, I encountered a number of macros from other packages that might be performing this work.
dbt Project configurations
Package versions
What database are you using dbt with?
snowflake
dbt Version
dbt 1.7.5
Additional Context
No response
Are you willing to open a PR to help address this issue?