Figure out API for incorporating information from lift tests on MMM

pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.

https://www.pymc-marketing.io/

Apache License 2.0

589 stars, 138 forks

Figure out API for incorporating information from lift tests on MMM #60

Closed ricardoV94 closed 2 months ago

ricardoV94 commented 1 year ago

@juanitorduz mentioned that in the literature this is usually incorporated into the prior information, but I think that in the HelloFresh project @lucianopaz found a way to incorporate such information via observed variables in the same MMM model?

lucianopaz commented 1 year ago

Yes, you can add the lift test measurements as observations. The way in which you go about doing this depends on what the lift test actually measures and how the experiment is performed. For the HelloFresh project, we added them as imperfect observations of the incremental CAC of media channels.
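As a small illustration of the quantity mentioned above, here is one way an incremental CAC could be read off a lift test. This is a sketch under assumptions: a simple treated-versus-control design is assumed, all names and numbers are hypothetical, and it is not the procedure used in the project, which the thread does not describe.

```python
# Hypothetical lift-test readout for one media channel (illustrative numbers only).
test_spend = 50_000.0         # extra spend in the treated group during the test
customers_treated = 1_400     # new customers acquired in the treated group
customers_control = 1_000     # new customers acquired in the matched control group

# Customers attributable to the extra spend (assumes the groups are comparable).
incremental_customers = customers_treated - customers_control

# Incremental CAC: spend needed to acquire one additional customer.
incremental_cac = test_spend / incremental_customers
```

A value like this, together with some measure of its uncertainty, is what would then enter the model as an imperfect observation.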

lucianopaz commented 1 year ago

I'm curious @juanitorduz, could you share some links on the prior information stuff?

juanitorduz commented 1 year ago

Hey! I have not gone into depth on this subject, but it is my immediate task.

At the moment I am using costs as priors, as done in Google's LightweightMMM; see https://github.com/google/lightweight_mmm/blob/main/lightweight_mmm/models.py#L350
I have read about lift tests being included in priors in Uber's work, Bayesian Time Varying Coefficient Model with Applications to Marketing Mix Modeling; see Section 4.1.2, Experimentation Calibration. (It has been a while since I read this.)
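To make "costs as priors" concrete, here is a minimal NumPy sketch. It assumes each channel's spend is used as the scale of a half-normal prior on that channel's coefficient; the spend figures and variable names are hypothetical, and this is not a copy of the linked code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical total spend per media channel (illustrative numbers only).
channel_costs = np.array([1000.0, 250.0, 4000.0])

# Assumption: spend sets the scale of a half-normal prior on each coefficient.
# Taking |N(0, scale)| yields half-normal draws with that scale.
n_draws = 5000
prior_draws = np.abs(
    rng.normal(loc=0.0, scale=channel_costs, size=(n_draws, channel_costs.size))
)

# A half-normal has mean scale * sqrt(2 / pi), so higher-spend channels get
# proportionally wider priors, while the prior mode stays at zero.
prior_means = prior_draws.mean(axis=0)
expected_means = channel_costs * np.sqrt(2 / np.pi)
```

Note that a prior of this kind only controls how large a coefficient may plausibly be; it does not pull the coefficient towards a specific value.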

lucianopaz commented 1 year ago

The Google code seems to set a prior on the scale of the coefficient, but it doesn’t constrain its mean, using something like a Gamma or LogNormal, and the Orbit paper doesn’t explain what they really did: they just say that they “ingest some observations as priors”, without explaining how. The approach that we followed was quite crude: we assumed that the lift test was an observation of a random variable. The mean of its distribution was computed from the model's estimate of the quantity that the lift test measured, over the period of the test. So it was nothing very fancy, and it could be improved by incorporating how the lift test was performed. I’m not sure how many details I can share about this though, so I’m being vague on purpose.
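The "lift test as an observation" idea can be sketched in a few lines. The example below is a toy, not the project's model: it assumes a single channel with a linear effect, Gaussian noise, a flat prior evaluated on a grid, and a lift test that measures total incremental sales over four weeks. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated weekly spend and sales with a known channel effect (hypothetical).
true_beta, noise_sd = 2.0, 5.0
spend = rng.uniform(0.0, 10.0, size=52)
sales = true_beta * spend + rng.normal(0.0, noise_sd, size=52)

# Hypothetical lift test over weeks 10..13: a noisy measurement of the
# incremental sales driven by the channel during that period.
test_weeks = slice(10, 14)
lift_sd = 3.0
lift_measured = true_beta * spend[test_weeks].sum() + rng.normal(0.0, lift_sd)


def normal_logpdf(x, mu, sd):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mu) / sd) ** 2 - np.log(sd) - 0.5 * np.log(2 * np.pi)


beta_grid = np.linspace(0.0, 4.0, 801)

# Log-likelihood of the sales series for every candidate coefficient.
loglik_sales = normal_logpdf(
    sales[None, :], beta_grid[:, None] * spend[None, :], noise_sd
).sum(axis=1)

# The lift test enters as one extra observation whose mean is the model's own
# estimate of incremental sales over the test period.
model_lift = beta_grid * spend[test_weeks].sum()
loglik_lift = normal_logpdf(lift_measured, model_lift, lift_sd)


def posterior_sd(logp):
    """Posterior standard deviation of beta on the grid (flat prior)."""
    p = np.exp(logp - logp.max())
    p /= p.sum()
    mean = (p * beta_grid).sum()
    return np.sqrt((p * (beta_grid - mean) ** 2).sum())


sd_without = posterior_sd(loglik_sales)
sd_with = posterior_sd(loglik_sales + loglik_lift)
```

In this toy setup the extra likelihood term can only add information about the coefficient, so `sd_with` comes out smaller than `sd_without`. How much it helps depends on the assumed measurement noise `lift_sd`, which is where knowledge of how the test was run would come in.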

juanitorduz commented 1 year ago

Thanks for the comments. Indeed, the explanations are generally vague ... and I believe it is not only because of privacy but also because the approach might depend on the nature of the lift test. That being said, if we are able to provide a simple (it does not need to be fancy, and if it follows a heuristic, great!) but effective framework for MMM calibration, this could be a key factor against competitors 😉.

juanitorduz commented 2 months ago

I think with the great work by @wd60622 in https://github.com/pymc-labs/pymc-marketing/pull/590, we can close this one 🚀 !