Clarification Question - Leakage

ar-asur commented 1 year ago

Hi Team

Great package and really amazing work!

I have a few questions to better understand the variables and the general modeling. In my case, I have a few channels whose spend is tied to the KPI that I am modeling as the dependent variable. For example, signups is the dependent variable, and the channels are things like partner websites and referrals. The amount spent on these channels depends on the dependent variable. I am unsure whether to include them as paid_media_spend, since that feels like leakage, and they are not organic_vars either. Any thoughts on how to include them in the model? I would really appreciate your advice.

Thanks!

sziolko commented 1 year ago

There are challenges with the default trend detection for similar reasons: the trend is decomposed from the outcome variable and then used to predict that same outcome variable.

For something like an affiliate marketing scenario, I have found that the CPA estimates end up very close to the actual costs associated with those referrals. This is also reflected in the adstock/decay curves, where the spend has only an immediate impact. We keep it in the model as paid_media_spend because that best reflects the contribution to our business; without it, other channels would artificially benefit by getting attribution for those conversions.
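To illustrate what "only an immediate impact" means in a decay curve, here is a minimal sketch of a geometric adstock transform in plain Python. It is not Robyn's code; the function name and the sample spend values are made up for illustration. A decay rate (theta) close to 0 means almost nothing carries over to later weeks, which is the pattern described above for affiliate spend.

```python
def geometric_adstock(spend, theta):
    """Geometric adstock: each period keeps a fraction `theta` of the
    previous period's adstocked value and adds the current spend."""
    adstocked = []
    carryover = 0.0
    for x in spend:
        carryover = x + theta * carryover
        adstocked.append(carryover)
    return adstocked


# Hypothetical weekly affiliate spend, for illustration only.
weekly_spend = [1010.0, 0.0, 0.0, 500.0]

# theta = 0: no carryover, so the output equals the raw spend.
print(geometric_adstock(weekly_spend, theta=0.0))  # [1010.0, 0.0, 0.0, 500.0]

# theta = 0.5: half of each week's effect carries into the next week.
print(geometric_adstock(weekly_spend, theta=0.5))  # [1010.0, 505.0, 252.5, 626.25]
```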

gufengzhou commented 1 year ago

Thanks both for sharing the question and the answers. It's tricky to treat these channels with "kick-backs", because they're endogenous, i.e. not independent of the outcome. A couple of options:

Some will just leave them out of the model; in that case, you could consider subtracting those conversions from the dependent variable from the beginning.
Others might want to use exposure metrics (clicks, impressions, etc.) instead of spend to build the model. However, Robyn doesn't use exposure metrics for the fitting because of the impact on budget allocation. Often, clients also don't get exposure metrics at a granular level.
Either way, you might not want to include these channels in the budget allocation, because you can't "scale up the spend" to get more response.
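The first option above (leave the channel out and subtract its conversions from the dependent variable) could be sketched as follows. This is a plain-Python illustration rather than anything from Robyn; the field names `week`, `total_signups` and `affiliate_signups` and the sample numbers are hypothetical, and it assumes you can identify affiliate-driven signups per week.

```python
def remove_affiliate_conversions(weekly_rows):
    """Return a new list of rows whose dependent variable excludes
    conversions that are already paid for per conversion (e.g. affiliates)."""
    adjusted = []
    for row in weekly_rows:
        remaining = row["total_signups"] - row["affiliate_signups"]
        if remaining < 0:
            # Affiliate signups should be a subset of total signups.
            raise ValueError(f"affiliate signups exceed total in week {row['week']}")
        adjusted.append({"week": row["week"], "signups_ex_affiliate": remaining})
    return adjusted


# Hypothetical data, for illustration only.
weekly_rows = [
    {"week": "2023-01-02", "total_signups": 1200, "affiliate_signups": 101},
    {"week": "2023-01-09", "total_signups": 1350, "affiliate_signups": 90},
]

for row in remove_affiliate_conversions(weekly_rows):
    print(row["week"], row["signups_ex_affiliate"])
```

The adjusted column would then serve as the dependent variable, with the affiliate channel dropped from the media inputs.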

ar-asur commented 1 year ago

Thank you @sziolko and @gufengzhou for your thoughts. I was thinking along the same lines as @gufengzhou's first point: remove them from the model and subtract the signups, since they are not controllable in the way the other channels are. For affiliate, my idea was to use the exposure metrics. I'm not sure I understood your point about exposure metrics not being used for fitting. For budget allocation, I understand that I would need spend data, but for fitting, why can't I control for it by using exposure metrics as context_vars?

@sziolko, could you elaborate on the CPA estimates for affiliate?

Thanks!

sziolko commented 1 year ago

@ar-asur, regarding the CPA estimates: when I include the affiliate network spend and the conversions it generated, I have found that the Robyn model does a good job of reflecting the costs in the model output (one-pagers etc.).

A made-up example: say I pay $10 per conversion to my affiliate network, have 10 other channels (social, offline, etc.), and am predicting my total conversions. I might show $1010 in spend for the affiliate network for a week, and those 101 conversions are also in my weekly conversion totals. When Robyn runs and I get a CPA for each of my channels in the one-pager, the CPA for affiliates comes out close to the cost ($10 in this example; perhaps the one-pager says $11.11 or $9.31, but it is only off by a reasonable percentage).

If the affiliate CPA is very different from my known cost, then I have reason to be skeptical of that particular model. For example, if the model indicates that my affiliate channel has a $0.50 or a $50 CPA, then I have to question what else the channel is getting credit for, or which other channel is highly correlated with it.
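The sanity check described above can be written down as a small helper. This is only a sketch: the function name is hypothetical, and the 25% tolerance is an assumption standing in for "a reasonable %", which the example does not pin down. The modeled CPA values would be read off the model output (for instance the one-pager) by hand.

```python
def cpa_is_plausible(modeled_cpa, known_cpa, tolerance=0.25):
    """True if the modeled CPA is within `tolerance` (relative) of the
    known per-conversion cost. The 25% default is an assumed threshold."""
    return abs(modeled_cpa - known_cpa) / known_cpa <= tolerance


# Known cost from the example: $1010 spend for 101 conversions.
known_cpa = 1010 / 101  # 10.0

for modeled in (11.11, 9.31, 0.50, 50.0):
    verdict = "plausible" if cpa_is_plausible(modeled, known_cpa) else "suspicious"
    print(f"modeled CPA ${modeled:.2f}: {verdict}")
```

With these inputs, $11.11 and $9.31 pass, while $0.50 and $50 are flagged as candidates for a closer look at correlated channels.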

ar-asur commented 1 year ago

@sziolko Got it, Thanks for the detailed explanation!

facebookexperimental / Robyn

Clarification Question - Leakage #712