Closed aleix-cd closed 1 year ago
Hi @aleix-cd thanks so much for taking the time to open this PR!
After reviewing your write up and taking a look at your suggested updates, I agree that this is something that I feel would be incredibly valuable to add to the package! There are a few suggested changes I can foresee, but altogether I would like to collaborate with you to integrate this into a future release of our package!
I am going to add this review to my up coming sprint and will share more feedback and suggestions then. Thanks again for thinking through this update and working to contribute a valuable addition to the package!
Since we have individual hierarchy types (account, campaign, ad group, ads) in the platform specific models, I think it would make the most sense to have the same hierarchy type variables for ad reporting. This would give the user control over each of the hierarchies and ensure that the metrics they are trying to passthrough are consistent (and maybe even renamed) across platforms to ensure the unioning works as intended. What are your thoughts?
So instead of just
ad_reporting__passthrough_metrics
we could havead_reporting__campaign_passthrough_metrics
. Or would you feel that this is duplicative? I just worry there may be some instances where metrics may not be consistent across hierarchies and I would want to isolate these pass through abilities. Thoughts?Additionally, I have a few suggestions in the review as well. Let me know if you would prefer I elaborate on any of the suggestions.
IMHO it would definitely feel repetitive, but agree that the flexibility it offers should be worth the extra lines. I will push the changes soon, let me know what you think :)
@fivetran-joemarkiewicz At this point, just need to test which is the best way to adapt the get_query()
macro to account for the different variable names :) Will give it a thought later today!
@fivetran-joemarkiewicz OK, I got this up and running! Few comments:
metric.name
did not work in my end, reverted back to metric
. Not sure how you would use it, but strings don't have a name method afaik. Let me know if I am missing something!get_query
in order to add the relevant passthrough metics at each step. I am no longer adding these in the consistent_fields step, let me know if that looks good!else
at the end of the get_query
macro to cast all passthrough_metrics
as floats, so expecting that all metrics can be casted as such {{ dbt.type_float() }}
Thanks @aleix-cd I will take a look later today!
Regarding your comment:
metric.name did not work in my end, reverted back to metric. Not sure how you would use it, but strings don't have a name method afaik. Let me know if I am missing something!
You are correct and I apologize for not mentioning this in my initial statement. All our other passthrough metrics for Ad Reporting use the dictionary structure as opposed to the list of strings. Therefore, I feel we should follow this method as well to be consistent across the ad platforms. For example it would be
vars:
ad_reporting__campaign_passthrough_metrics:
- name: conversions
- name: total_shares
As opposed to
vars:
ad_reporting__campaign_passthrough_metrics: ["conversions","total_shares"]
That being said, I know we have made some enhancements in other packages that support both forms. I will take a look to see how this was done as we may be able to repurpose here as well.
Thanks @aleix-cd I will take a look later today!
Regarding your comment:
metric.name did not work in my end, reverted back to metric. Not sure how you would use it, but strings don't have a name method afaik. Let me know if I am missing something!
You are correct and I apologize for not mentioning this in my initial statement. All our other passthrough metrics for Ad Reporting use the dictionary structure as opposed to the list of strings. Therefore, I feel we should follow this method as well to be consistent across the ad platforms. For example it would be
vars: ad_reporting__campaign_passthrough_metrics: - name: conversions - name: total_shares
As opposed to
vars: ad_reporting__campaign_passthrough_metrics: ["conversions","total_shares"]
That being said, I know we have made some enhancements in other packages that support both forms. I will take a look to see how this was done as we may be able to repurpose here as well.
Ah, my bad! I completely missed that the ad_reporting packages (e.g. google_ads) are using the following fivetran_utils macro to invoque the passthrough metrics:
{{ fivetran_utils.persist_pass_through_columns(pass_through_variable='var', transform = 'sum') }}
I could use this instead of the for loop I'm declaring in each model. What do you think?
@aleix-cd yeah if we could get it to leverage an already existing macro then that would be ideal! Let me know if it doesn't work on your end. Maybe we can look to adjust the macro be more flexible.
@fivetran-joemarkiewicz Alright Joe, this should be ready to review now!
I now treat all the variables as Dicts, and use the fivetran_utils.persist_pass_through_columns()
macro on all reports. Tested on my end and it worked with the following config:
vars:
ad_reporting__url_passthrough_metrics:
- name: conversions
Let me know!
@aleix-cd fantastic! Our next sprint starts Thursday. I will review and work to integrate this into a new release in the coming sprint! I will follow up if I have any other comments or suggestions 😄
transform_sql
Spotted the error:
I was defining campaign_passthrough_metric_dict
as
{% set campaign_passthrough_metric_array_of_dicts = var('ad_reporting__campaign_passthrough_metrics')[0] %}
So only selecting the first element of the array -- in this case, conversions
.
I am now iterating over that list of dictionaries:
{% set campaign_passthrough_metric_array_of_dicts = var('ad_reporting__campaign_passthrough_metrics') %}
{%- for campaign_group_passthrough_metric_dict in campaign_passthrough_metric_array_of_dicts -%}
{%- for campaign_passthrough_metric_value in campaign_passthrough_metric_dict.values() -%}
{%- do final_fields_superset.update({campaign_passthrough_metric_value: campaign_passthrough_metric_value}) -%}
{%- endfor -%}
{%- endfor -%}
And this should be solved! All passthrough_metrics
should now be populating the intermediate models :)
Just applied the requested fixes, @fivetran-joemarkiewicz! Thanks a lot for the tips, appreciate these a lot!
One additional question I wanted to ask was if you had any strong preference for keeping the url_report and ad_report passthrough metrics distinct? I noticed in your code you have a variable for
ad_reporting__ad_passthrough_metrics
and one forad_reporting__url_passthrough_metrics
; however, in our upstream packages we use the same passthrough variable for both reports since they are built from the same base table. I am considering doing the same here to be consistent across the packages.
That was my bad, I did not realise that the upstream models for ad
and url
reports were sharing the same variable. I've added the required changes, so now both ad
and url
reports should be sharing the same passthrough metrics. 👍
We are not using this in prod, so feel free to make any required changes! And thank you for the thorough review, it has been quite a good learning experience! 💯
@aleix-cd just wanted to provide a quick update that there seem to be a few unrelated BigQuery JSON issues that are occurring in the upstream Ad packages. These just popped up following an update from BigQuery and we are currently triaging these issues.
Ultimately, we will be applying fixes to our upstream packages in the coming sprint. These changes will be breaking changes for the upstream packages. Therefore, we are planning to batch this PR with the next round of breaking changes so we don't cut breaking changes back to back.
I appreciate your patience!
The upstream changes are in motion and we can move forward with merging this into the release branch! My plan is to release this early next week!
Are you a current Fivetran customer?
Yes, from a company called
Capdesk
.What change(s) does this PR introduce?
Introducing the variable
ad_reporting__passthrough_metrics
, which will serve as a "common denominator" for those metrics that are present in all of our Ads platforms.In our case, for example, we like to track
conversions
both in Linkedin, Google, Microsoft and Facebook. Given thatad_reporting
was not providing this column, we had to specifically build somead_reporting
models to just add this column.With this PR, we should be able to do so by just giving the
ad_reporting__passthrough_metrics
the names of the metrics we want to add into our final models.On the other hand, this would require a place in the README.md, in order to let package users know that this is possible (and the caveats as well e.g. that this metric needs to be present, and called the same, in all the Ads platforms we are merging). Let me know if I can provide more help in this regard!
Did you update the CHANGELOG?
Does this PR introduce a breaking change?
Did you update the dbt_project.yml files with the version upgrade (please leverage standard semantic versioning)? (In both your main project and integration_tests)
Is this PR in response to a previously created Bug or Feature Request
How did you test the PR changes?
Select which warehouse(s) were used to test the PR
Provide an emoji that best describes your current mood
:dancer: **Feedback** We are so excited you decided to contribute to the Fivetran community dbt package! We continue to work to improve the packages and would greatly appreciate your [feedback](https://www.surveymonkey.com/r/DQ7K7WW) on our existing dbt packages or what you'd like to see next.