fivetran / dbt_apple_store

Fivetran's Apple App Store source dbt package
https://fivetran.github.io/dbt_apple_store/
Apache License 2.0
0 stars 2 forks source link

[Bug] apple_store__subscription_report joins on null values #17

Closed awoehrl closed 8 months ago

awoehrl commented 8 months ago

Is there an existing issue for this?

Describe the issue

In apple_store__subscription_report, there is a join on state but that field will be null for most countries. This results in metrics like event_renew, event_cancel, event_subscribe being 0 as the rows are not joined correctly.

    select 
        reporting_grain.date_day,
        reporting_grain.account_id,
        reporting_grain.account_name, 
        reporting_grain.app_id,
        reporting_grain.app_name,
        reporting_grain.subscription_name, 
        case 
            when country_codes.alternative_country_name is null then country_codes.country_name
            else country_codes.alternative_country_name
        end as territory_long,
        reporting_grain.country as territory_short,
        reporting_grain.state,
        country_codes.region, 
        country_codes.sub_region,
        coalesce(subscription_summary.active_free_trial_introductory_offer_subscriptions, 0) as active_free_trial_introductory_offer_subscriptions,
        coalesce(subscription_summary.active_pay_as_you_go_introductory_offer_subscriptions, 0) as active_pay_as_you_go_introductory_offer_subscriptions,
        coalesce(subscription_summary.active_pay_up_front_introductory_offer_subscriptions, 0) as active_pay_up_front_introductory_offer_subscriptions,
        coalesce(subscription_summary.active_standard_price_subscriptions, 0) as active_standard_price_subscriptions
        {% for event_val in var('apple_store__subscription_events') %}
        {% set event_column = 'event_' ~ event_val | replace(' ', '_') | trim | lower %}
        , coalesce({{ 'subscription_events.' ~ event_column }}, 0)
            as {{ event_column }} 
        {% endfor %}
    from reporting_grain
    left join subscription_summary
        on reporting_grain.date_day = subscription_summary.date_day
        and reporting_grain.account_id =  subscription_summary.account_id 
        and reporting_grain.app_name = subscription_summary.app_name
        and reporting_grain.subscription_name = subscription_summary.subscription_name
        and reporting_grain.country = subscription_summary.country
        **and reporting_grain.state = subscription_summary.state**
    left join subscription_events
        on reporting_grain.date_day = subscription_events.date_day
        and reporting_grain.account_id =  subscription_events.account_id 
        and reporting_grain.app_name = subscription_events.app_name
        and reporting_grain.subscription_name = subscription_events.subscription_name
        and reporting_grain.country = subscription_events.country
        **and reporting_grain.state = subscription_events.state**
    left join country_codes
        on reporting_grain.country = country_codes.country_code_alpha_2

Relevant error log or model output

No response

Expected behavior

Correctly get all required rows from all joins

dbt Project configurations

-

Package versions

What database are you using dbt with?

bigquery

dbt Version

1.7.7

Additional Context

No response

Are you willing to open a PR to help address this issue?

fivetran-jamie commented 8 months ago

hi there @awoehrl thanks so much for opening this and taking the time to contribute your PR! your code change seems very straightforward and sensible to me -- our team will prioritize reviewing and merging it in in the upcoming sprint 🙂

fivetran-joemarkiewicz commented 8 months ago

@awoehrl thanks again for opening this issue and the corresponding PR! This is now live in the latest version. As such, I will close out this Issue. Thank you again 😄