pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
https://www.pymc-marketing.io/
Apache License 2.0

How to add basic priors from uplift experiments? #1024

Open BrianMiner opened 2 months ago

BrianMiner commented 2 months ago

Thank you for just a wonderful library! Really amazing progress!

I wanted to make sure that the approach to using information from historical experiments to alter priors is correct / reasonable.

Background:

We have either the result of an experiment where in certain geos we exclude a given marketing channel and measure the uplift when the channel was active versus withheld OR we have estimates from a third party on what they have seen historically through experiments in a given industry. The result of both of these would be an interval of revenue per dollar spent in a channel (i.e. ROAS). Thus there is a low and a high value.

The MMM has as its target variable revenue and the form of the marketing exposure is spend.

Example

Here are some example values for a particular channel. Let's say this is the "high" end of the interval.

revenue_per_dollar_spent = 1.18   # from experiment
abs_max_target = 15000               # max value of the outcome of the mmm - i.e. revenue
abs_max_media = 4000                # max value of the spend in a period for this channel

So, if we spend 1 dollar we generate 1.18 dollars. Thus in 'dollar space' we expect the MMM coefficient to be around 1.18.

We need to divide both numerator and denominator of this ratio revenue_per_dollar_spent / 1 by their respective max abs values.

In the scaled model space we expect the media coefficient to be around (revenue_per_dollar_spent / abs_max_target) / (1 / abs_max_media) = 1.18 * 4000 / 15000 ≈ 0.3147.

Is this correct?
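
The rescaling arithmetic above can be checked with a few lines of Python. This is only a sketch of the conversion as described in this comment: the helper names are made up for illustration, and it assumes a purely linear response with max-abs scaling on both spend and revenue (no adstock or saturation).

```python
# Values quoted in the comment above.
revenue_per_dollar_spent = 1.18  # high end of the experiment interval
abs_max_target = 15000           # max revenue in the modelled period
abs_max_media = 4000             # max spend for this channel


def roas_to_scaled_coefficient(roas, max_target, max_media):
    """Convert revenue-per-dollar into max-abs-scaled units.

    Hypothetical helper: (roas / max_target) / (1 / max_media).
    Only meaningful if the response is linear in spend.
    """
    return (roas / max_target) / (1 / max_media)


def scaled_coefficient_to_roas(coef, max_target, max_media):
    """Inverse of the conversion above (also a hypothetical helper)."""
    return coef * max_target / max_media


scaled_high = roas_to_scaled_coefficient(
    revenue_per_dollar_spent, abs_max_target, abs_max_media
)
print(round(scaled_high, 4))  # 0.3147

# Round trip back to dollar space recovers the original ROAS.
roas_back = scaled_coefficient_to_roas(scaled_high, abs_max_target, abs_max_media)
print(round(roas_back, 2))  # 1.18
```

Running the inverse on a scaled value of 0.272 gives a revenue per dollar of about 1.02 under the same assumptions.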

If we repeat this for the "low" end of the interval we might find its value is, say, 0.272. We could then pick a distribution for the prior and use something like this to parameterize it:

pm.find_constrained_prior(pm.Gamma, lower=0.272, upper=0.3147, mass=0.95, init_guess=dict(alpha=1, beta=1), options={"maxiter": 50000})
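
Because the interval is narrow relative to its location, the resulting Gamma will be tight, and alpha = 1, beta = 1 is likely a long way from the solution. One way to get a closer starting point is to moment-match a Gamma to the interval, treating the bounds as roughly mean ± 1.96 standard deviations. The sketch below does that with plain Python. It is a normal approximation, not what pm.find_constrained_prior computes, and the function name is hypothetical.

```python
def gamma_from_interval(lower, upper, z=1.96):
    """Approximate Gamma(alpha, beta) parameters from a central interval.

    Assumption: the interval is treated as mean +/- z * sd, which is
    reasonable only when the Gamma is close to symmetric (large alpha).
    Uses the Gamma moments: mean = alpha / beta, var = alpha / beta**2.
    """
    mean = (lower + upper) / 2
    sd = (upper - lower) / (2 * z)
    alpha = (mean / sd) ** 2
    beta = mean / sd**2
    return alpha, beta


alpha_guess, beta_guess = gamma_from_interval(0.272, 0.3147)
print(f"alpha ~ {alpha_guess:.0f}, beta ~ {beta_guess:.0f}")

# Sanity check: the implied prior mean sits at the interval midpoint.
print(f"implied mean = {alpha_guess / beta_guess:.4f}")
```

The two values could then be passed as init_guess=dict(alpha=..., beta=...) so the optimizer starts near the answer rather than relying on a large maxiter.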

wd60622 commented 1 month ago

Hi,

Which transformation are you using? Which variable in the model would you use that Gamma for?

BrianMiner commented 1 month ago

Hi @wd60622

I want to say, "does it matter?" since we are talking about the prior on the beta: [image]

Given the form of the model, this coefficient should represent the revenue generated per dollar spent on the channel (assuming the output and input are both in dollars). So I was thinking this was correct and was looking for confirmation on how the scaling was considered.

In terms of the Gamma, I guess that could be any prior we chose for beta. Could have considered a half-normal just as well.

BrianMiner commented 1 month ago

Checking if anyone can confirm?

wd60622 commented 1 month ago

Hi @BrianMiner

I don't think beta has that interpretation for every transformation, because of the non-linearity; that's why I ask. Different saturation functions have different interpretations. For the most part, beta governs the asymptotic behavior of the media variable, which might not be exactly informed by the values you are describing.

This is why the lift test formulation uses observations of the saturation curve instead of a single change to the prior(s): it incorporates the saturation point as well as the asymptotic behavior.
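
To make the "observations of the saturation curve" idea concrete, here is a minimal sketch. It assumes a logistic saturation of the form (1 - exp(-lam * x)) / (1 + exp(-lam * x)) on scaled spend, ignores adstock, and uses made-up values for beta and lam. The function names are illustrative and are not the library's API.

```python
import math


def logistic_saturation(x, lam):
    """Assumed logistic saturation on scaled spend; rises from 0 toward 1."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


def model_implied_lift(x, delta_x, beta, lam):
    """Change in channel contribution when spend moves from x to x + delta_x."""
    return beta * (
        logistic_saturation(x + delta_x, lam) - logistic_saturation(x, lam)
    )


beta, lam = 0.5, 3.0  # hypothetical parameter values
delta_x = 0.1         # same spend increment at each baseline

# Incremental return per unit of scaled spend at different baselines.
returns = []
for x in (0.1, 0.5, 0.9):
    lift = model_implied_lift(x, delta_x, beta, lam)
    returns.append(lift / delta_x)
    print(f"baseline x={x:.1f}: incremental return per unit = {lift / delta_x:.3f}")
```

The same spend increment yields a smaller return at higher baselines, so an experiment's revenue per dollar is tied to the spend level at which it was run. A lift test observation (baseline spend, change in spend, change in outcome, uncertainty) constrains beta and lam jointly at that point on the curve.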

Are you able to fit your experiment results into the framework used for lift tests?

BrianMiner commented 1 month ago

@wd60622

I need to look closer at the lift tests. What I have is basically estimates of revenue per dollar of spend from the experiments.

My thinking was that even though the spend for a given channel at a given point goes through adstock and saturation, at the end of the day the value being multiplied by the beta is the spend for that channel in that period, albeit adjusted to the spend the other functions decided was relevant at that time. So I didn't think the saturation function mattered, and I took the interpretation of the beta to be the same as in any linear model.

BrianMiner commented 1 month ago

If you get a chance, can you explain this a bit more: "For the most part, that beta will be the asymptotic behavior of the media variable which might not exactly be informed by the values you are describing."

wd60622 commented 3 weeks ago

@BrianMiner That is the parameter interpretation for many of the saturation functions. The multiplier (usually beta) will determine the asymptotic behavior of the media variable.

The beta is left in Logistic, but you can see that there is a parameter to control the asymptotic value: https://pymc-marketing-app.streamlit.app/Saturation
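
A small numerical illustration of the asymptote point, under the same kind of assumptions as the discussion: a logistic saturation of the form (1 - exp(-lam * x)) / (1 + exp(-lam * x)) on scaled spend, no adstock, and hypothetical values for beta and lam.

```python
import math


def logistic_saturation(x, lam):
    """Assumed logistic saturation; bounded above by 1."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


beta, lam = 0.3147, 4.0  # hypothetical values for illustration

for x in (0.05, 0.5, 1.0, 5.0):
    contribution = beta * logistic_saturation(x, lam)
    # contribution / x is the average return per unit of scaled spend.
    print(f"x={x:<4} contribution={contribution:.4f} per-unit={contribution / x:.4f}")

# As spend grows, the contribution approaches beta (the asymptote).
asymptote = beta * logistic_saturation(50.0, lam)

# Near zero spend the curve is roughly linear with slope beta * lam / 2,
# so the per-unit return there depends on lam as well as beta.
small_x = 1e-6
initial_slope = beta * logistic_saturation(small_x, lam) / small_x
```

Under these assumptions beta is the ceiling on the channel's contribution, while the return per unit of spend starts near beta * lam / 2 and falls as spend increases. A revenue-per-dollar estimate therefore does not map onto beta alone.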