pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
https://www.pymc-marketing.io/
Apache License 2.0

Is there a way to do nested models using pymc_marketing #940

Open shuvayan opened 3 months ago

shuvayan commented 3 months ago

Hello Bayesian Wizards,

There are times when nested models are needed to capture interactions between channels, as shown below:

[Image: Screenshot 2024-08-16 at 6 54 16 PM]

I am wondering whether we could add this feature to pymc_marketing. It would greatly help!

Please let me know if this is already possible in the current library. If it needs to be developed, I can contribute to it (with guidance)!

wd60622 commented 3 months ago

Might be related to this new doc showing linear effects can be better estimated via use of lift tests: https://www.pymc-marketing.io/en/latest/notebooks/mmm/mmm_roas.html

What are your assumptions about the model?

CC: @juanitorduz

juanitorduz commented 3 months ago

I think it is more related to https://engineering.hellofresh.com/bayesian-media-mix-modeling-using-pymc3-for-fun-and-profit-2bd4667504e6

@shuvayan A good starting point would be exploring the PyMC model from HelloFresh's blog post and translating it to PyMC-Marketing. You will probably need to generate some data yourself :)

shuvayan commented 3 months ago

Thank you @wd60622 and @juanitorduz. I will go through the links and see if I can come up with something.

shuvayan commented 3 months ago

Hello @wd60622 & @juanitorduz ,

I think an implementation of the nested structure in PyMC would look something like the below:

import numpy as np
import pymc as pm

# Simulated data
n = 500
rng = np.random.default_rng(42)

# Exogenous channels
channel_2 = rng.normal(10, 2, n)
channel_3 = rng.normal(5, 1, n)
channel_4 = rng.normal(7, 1.5, n)

# Simulate the nested channels and revenue. The coefficients used here are
# arbitrary illustrative values, not estimates from real data.
channel_1 = 0.5 + 0.5 * channel_2 + 0.3 * channel_3 + rng.normal(0, 1, n)
channel_5 = 0.2 + 0.6 * channel_1 + 0.4 * channel_4 + rng.normal(0, 1, n)
revenue = (1.0 + 0.5 * channel_1 + 0.2 * channel_2 + 0.1 * channel_3
           + 0.3 * channel_4 + 0.4 * channel_5 + rng.normal(0, 1, n))

# All stages live in one model so they are sampled together. Each stage is a
# likelihood on observed data rather than a Deterministic with a noise term.
with pm.Model() as nested_model:
    # Channel_1 dependent on Channel_2 and Channel_3
    alpha_1 = pm.Normal("alpha_1", mu=0, sigma=1)
    beta_12 = pm.Normal("beta_12", mu=0, sigma=1)
    beta_13 = pm.Normal("beta_13", mu=0, sigma=1)
    sigma_1 = pm.HalfNormal("sigma_1", sigma=1)
    mu_1 = alpha_1 + beta_12 * channel_2 + beta_13 * channel_3
    pm.Normal("Channel_1", mu=mu_1, sigma=sigma_1, observed=channel_1)

    # Channel_5 dependent on Channel_1 and Channel_4
    alpha_5 = pm.Normal("alpha_5", mu=0, sigma=1)
    beta_51 = pm.Normal("beta_51", mu=0, sigma=1)
    beta_54 = pm.Normal("beta_54", mu=0, sigma=1)
    sigma_5 = pm.HalfNormal("sigma_5", sigma=1)
    mu_5 = alpha_5 + beta_51 * channel_1 + beta_54 * channel_4
    pm.Normal("Channel_5", mu=mu_5, sigma=sigma_5, observed=channel_5)

    # Final Revenue dependent on all channels
    gamma_0 = pm.Normal("gamma_0", mu=0, sigma=1)
    gamma = pm.Normal("gamma", mu=0, sigma=1, shape=5)  # gamma_1 .. gamma_5
    sigma_revenue = pm.HalfNormal("sigma_revenue", sigma=1)
    channels = np.column_stack(
        [channel_1, channel_2, channel_3, channel_4, channel_5]
    )
    mu_revenue = gamma_0 + pm.math.dot(channels, gamma)
    pm.Normal("Revenue", mu=mu_revenue, sigma=sigma_revenue, observed=revenue)

    trace = pm.sample(1000, tune=1000, random_seed=42)

If this seems right, could you help me modify the specific modules of pymc_marketing so that these changes can be integrated and merged? If I am missing something here, please point it out.
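To make the nesting concrete: in a purely linear structure like the one above, a channel's total effect on revenue is its direct coefficient plus the product of coefficients along every indirect path. The sketch below works through that arithmetic. The function name `total_effects` and the coefficient values are hypothetical, chosen only for illustration, and the calculation assumes no adstock or saturation transforms.

```python
def total_effects(coef):
    """Total effect of each channel on revenue in the nested linear model.

    Paths assumed (from the structure described above):
      Channel_2, Channel_3 -> Channel_1 -> Revenue
      Channel_1, Channel_4 -> Channel_5 -> Revenue
      every channel        -> Revenue (direct)
    """
    # Channel_5 only has a direct path to revenue.
    total_5 = coef["gamma_5"]
    # Channel_1 acts directly and through Channel_5.
    total_1 = coef["gamma_1"] + coef["beta_51"] * total_5
    # Channel_2 and Channel_3 act directly and through Channel_1.
    total_2 = coef["gamma_2"] + coef["beta_12"] * total_1
    total_3 = coef["gamma_3"] + coef["beta_13"] * total_1
    # Channel_4 acts directly and through Channel_5.
    total_4 = coef["gamma_4"] + coef["beta_54"] * total_5
    return {
        "Channel_1": total_1,
        "Channel_2": total_2,
        "Channel_3": total_3,
        "Channel_4": total_4,
        "Channel_5": total_5,
    }


# Hypothetical coefficient values, for illustration only.
example_coef = {
    "gamma_1": 0.5, "gamma_2": 0.2, "gamma_3": 0.1,
    "gamma_4": 0.3, "gamma_5": 0.4,
    "beta_12": 0.5, "beta_13": 0.3,
    "beta_51": 0.6, "beta_54": 0.4,
}

effects = total_effects(example_coef)
for channel, effect in effects.items():
    print(f"{channel}: {effect:.3f}")
```

With posterior samples instead of point values, the same products could be taken draw by draw to get a distribution over each total effect.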

juanitorduz commented 3 months ago

Hey @shuvayan! This looks interesting. A good place to start is to create an example notebook. Ideally we would use pymc-marketing models as components of the nested models (as in the HelloFresh blog post). Would you like to try this out?

shuvayan commented 3 months ago

Hello @juanitorduz ,

Yes. I could not find any implementation details in the HelloFresh blog post, so I used plain PyMC to get started. If you could point me to the relevant parts of pymc_marketing (for example, by sharing some sample code or a notebook), I can start modifying them on my local machine and try things out. Let me know if that makes sense!

juanitorduz commented 3 months ago

I think a good starting point is to simulate data yourself (or find a public data set) and fit the direct and indirect models using the MMM class in pymc-marketing (the direct and indirect models from the HelloFresh blog post). After you have fitted both models, you can estimate the adjusted CAC as described in the blog post. Feel free to open a draft PR and we can take it from there :)
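As a rough illustration of the data-simulation step only, here is a minimal sketch using the Python standard library. It assumes one reading of the setup: an upper-funnel channel (`tv`) drives a lower-funnel channel (`search`), and both drive `sales`. The column names, coefficients, and the helper `simulate_funnel` are all hypothetical, and the sketch does not fit any model or compute the adjusted CAC.

```python
import random


def simulate_funnel(n_weeks=104, seed=0):
    """Simulate weekly data where TV drives sales directly and via search."""
    rng = random.Random(seed)
    rows = []
    for week in range(n_weeks):
        # Upper-funnel spend, drawn independently each week.
        tv = rng.uniform(0, 100)
        # Lower-funnel channel partly driven by TV (target of an indirect model).
        search = 20 + 0.5 * tv + rng.gauss(0, 5)
        # Sales driven by both channels (target of a direct model).
        sales = 10 + 0.1 * tv + 0.8 * search + rng.gauss(0, 5)
        rows.append({"week": week, "tv": tv, "search": search, "sales": sales})
    return rows


data = simulate_funnel()
print(data[0])
```

Under these assumptions, an indirect model would take `search` as its target with `tv` as the driver, and a direct model would take `sales` as its target with both channels as drivers. Because the true coefficients are known, the fitted estimates can be checked against them.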

shuvayan commented 4 weeks ago

Hello @juanitorduz ,

MMM_Funnels.pdf

After a lot of trial and error, I think I have a working version. Can I open a pull request with this, or does it need any major modifications?

I am not sure how best to link the funnel stages, but we might use the lower-funnel spends as controls for the middle funnel, and so on. That part is not implemented here, for brevity.

Also, I am not able to upload the notebook here, so that is a bummer!