fivetran / dbt_google_ads_source

Fivetran data models for Google Ads built using dbt.
https://fivetran.github.io/dbt_google_ads_source/
Apache License 2.0
10 stars 20 forks source link

[Feature] Add ability to union multiple Google Ads connectors together #34

Closed zhyatt closed 12 months ago

zhyatt commented 1 year ago

Is there an existing feature request for this?

Describe the Feature

I noticed that the Facebook Pages Source dbt package provides an option to union together multiple instances of Facebook Pages connectors for Fivetran: https://hub.getdbt.com/fivetran/facebook_pages_source/latest/. That option isn't available for the Google Ads Source package (https://hub.getdbt.com/fivetran/google_ads_source/latest/), but I am interested in having that included so I can utilize it.

This would likely use the existing union_data function in dbt_fivetran_utils package (https://github.com/fivetran/dbt_fivetran_utils/tree/v0.4.2/#union_data-source) similar to how it was done in Facebook Pages.

Describe alternatives you've considered

We currently work around this limitation with two main changes:

  1. We created models to union the source data together and push it to some central views
  2. We override the variables that define the source input to this package to change them to use the ref() function (a bit hacky), which preserves the ability to have fully integrated scheduling in Fivetran. Here is an example of how the overrides were done in dbt_project.yml:
    google_ads_source:
    account_history: "{{ ref('stg_google_ads_combined__account_history') }}"
    account_stats: "{{ ref('stg_google_ads_combined__account_stats') }}"
    ad_group_criterion_history: "{{ ref('stg_google_ads_combined__ad_group_criterion_history') }}"
    ad_group_history: "{{ ref('stg_google_ads_combined__ad_group_history') }}"
    ad_group_stats: "{{ ref('stg_google_ads_combined__ad_group_stats') }}"
    ad_history: "{{ ref('stg_google_ads_combined__ad_history') }}"
    ad_stats: "{{ ref('stg_google_ads_combined__ad_stats') }}"
    campaign_history: "{{ ref('stg_google_ads_combined__campaign_history') }}"
    campaign_stats: "{{ ref('stg_google_ads_combined__campaign_stats') }}"
    geo_target: "{{ ref('stg_google_ads_combined__geo_target') }}"
    keyword_stats: "{{ ref('stg_google_ads_combined__keyword_stats') }}"
    label: "{{ ref('stg_google_ads_combined__label') }}"

Are you interested in contributing this feature?

Anything else?

No response

elanfivetran commented 1 year ago

Hey @zhyatt, thanks for submitting this feature request!

The Union Schema is a very popular feature and we are aware that customers are interested in having this feature included in a number of packages that currently don't have it. I'll note this as another request for adding the feature to the Google Ads package and will get back with when we can expect to have this implemented.

fivetran-catfritz commented 1 year ago

Hi @zhyatt, to update you, I will working on this union update this quarter. Stay tuned!

fivetran-catfritz commented 1 year ago

Hi @zhyatt, I have created a test branch with the ability to union multiple ad connectors together. If you are able to try it out, you can install it using the below code in your packages.yml in place of the normal google_ads_source lines.

- git: https://github.com/fivetran/dbt_google_ads_source.git
  revision: MagicBot/add-union-schema
  warn-unpinned: false

Also, below are the instructions from the README on how to set it up. Let us know if you have any questions about this as well. We look forward to any feedback you have!

The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the source_relation column of each model. To use this functionality, you will need to set either the google_ads_union_schemas OR google_ads_union_databases variables (cannot do both) in your root dbt_project.yml file:

vars:
    google_ads_union_schemas: ['google_ads_usa','google_ads_canada'] # use this if the data is in different schemas/datasets of the same database/project
    google_ads_union_databases: ['google_ads_usa','google_ads_canada'] # use this if the data is in different databases/projects but uses the same schema name

Please be aware that the native source.yml connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined source.yml.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the Union Data Defined Sources Configuration section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.

zhyatt commented 1 year ago

@fivetran-catfritz Thanks for notifying me of this update, great to see it happening. I was able to test this branch using the setup instructions provided and got it working, so it looks like the change will work for us! 🎉

I will note that installing this via the git approach above results in the error Found duplicate project "google_ads_source". This occurs when a dependency has the same project name as some other dependency.. This was after removing the other package approach to installing google_ads_source and running a dbt clean && dbt deps on the project.

Based on a separate issue with the same error, this appears to be a dependency resolution problem due to using the git approach along with using the google_ads package as well (which has a dependency on google_ads_source). I was able to work around this by installing the packages separately and then manually overwriting the google_ads_source package with the test version. I did this due to some other dependency issues that weren't trivial to work though and thought it might be worth noting in case others run into this or there is an easy way to avoid this.

Thanks again for the update!

fivetran-catfritz commented 1 year ago

Hi @zhyatt thanks for the update! I'm glad the source was working out for you. I didn't realize you were also using our google_ads transform package--you can use the below branch instead. It is a test branch for google_ads that will pull in the revised google_ads_source package.

- git: https://github.com/fivetran/dbt_google_ads.git
  revision: MagicBot/add-union-schema
  warn-unpinned: false

As for the other dependency issues, we did cut a release earlier today that caused some issues for users, but we have since rolled it back, so wondering if that will also resolve things for you.

zhyatt commented 1 year ago

Thanks @fivetran-catfritz , that branch for the google_ads package worked to resolve the dependency issue I was facing. 🙌

fivetran-catfritz commented 1 year ago

Awesome! We should officially be releasing the new version with these changes in the next couple weeks. We'll keep you posted!

fivetran-catfritz commented 12 months ago

This feature has been added to v0.10.0 and released today! Closing out this issue, but please continue to let us know any feedback!