Closed clay-walker closed 1 year ago
Hi @clay-walker thanks for opening this issue. Someone from my team will follow up here in the next few days to discuss the issue further and determine a possible solution.
hey @clay-walker i definitely agree that match_type
should be included in the surrogate key!
as for your second point, does microsoft hard-delete keyword_id
from the daily search report? i am wondering if changing this line to reference report.keyword_id
(instead of keyword_history) would result in no nulls. i must admit i'm a little apprehensive to apply an inner join 😨
also just curious - do you know if microsoft hard deletes IDs from other history tables (ie campaign or ad_group?)
looks like someone else is seeing this issue with the microsoft ad_group report as well - https://github.com/fivetran/dbt_microsoft_ads/issues/18. and we've just confirmed that keyword_id
or other ids are not hard-deleted from the report tables.
so my next question would be, what's your opinion on referencing the report.keyword_id
instead, so that if a keyword has been hard-deleted, you'd still have its ID but not any other info about it from the keyword_history
table? would that be valuable or confusing/unnecessary to have, as opposed to filtering them out with an inner join? i presume that some people would at least want to keep them 🤔
i am thinking that the path forward will be
id
(such as keyword_id
) in the final CTE of a model, but from the right side of the join. we should change this to reference the report on the left side of the joinHi @clay-walker the latest version of the package (v1.0.4
) should address this issue you identified within this bug report. Feel free to upgrade to the latest version and the initial issue should be addressed.
For the time being I will close this issue as the fix has been rolled out. Please feel free to reopen or create a new issue if you experience any other questions when using the package.
Is there an existing issue for this?
Describe the issue
The Microsoft ads search report model uses the following columns as a surrogate key:
This combination of columns does not uniquely identify a record, and the test fails. The cause of this is:
match_type (a concatenation of two fields) is used in the select from the base table, but not included in the surrogate key. The two fields that are concatenated into match_type are actually part of the key for the base table (bingads.keyword_performance_daily_report). This is a screenshot of Fivetran as evidence:
Additionally, keyword_id is part of the composite key, but that field is left joined from the keyword history table. Since it's left joined, and keywords are apparently hard deleted from the history table (wtf Microsoft!), the resulting value will be
null
, which is no bueno for a field that is used in a surrogate key.Proposed solution:
Relevant error log or model output
Expected behavior
Passing tests
dbt Project configurations
vars: prod_database: analytics prod_schema: business operdb_schema_pattern: 'operdb%' raw_data_db: GONG "dbt_date:time_zone": "America/Los_Angeles" timezone_constant: "America/Los_Angeles"
ad_reportingapple_search_ads_enabled: False # by default this is assumed to be True ad_reportingpinterest_ads_enabled: False # by default this is assumed to be True ad_reportingtwitter_ads_enabled: False # by default this is assumed to be True ad_reporting__facebook_ads_enabled: False # by default this is assumed to be True ad_reportingsnapchat_ads_enabled: False # by default this is assumed to be True ad_reporting__tiktok_ads_enabled: False # by default this is assumed to be True
linkedin_ads_schema: linkedin_ads linkedin_ads_database: source_raw
google_ads_schema: google_ads google_ads_database: source_raw
microsoft_ads_schema: bingads microsoft_ads_database: source_raw
Package versions
packages:
package: fivetran/salesforce_formula_utils version: 0.6.4
package: calogica/dbt_date version: 0.5.7
package: fivetran/ad_reporting version: [">=1.0.0", "<1.1.0"]
What database are you using dbt with?
snowflake
dbt Version
1.0.1
Additional Context
No response
Are you willing to open a PR to help address this issue?