google / lightweight_mmm

LightweightMMM 🦇 is a lightweight Bayesian Marketing Mix Modeling (MMM) library that allows users to easily train MMMs and obtain channel attribution information.
https://lightweight-mmm.readthedocs.io/en/latest/index.html
Apache License 2.0
865 stars, 177 forks

lag_weight is way too high for a specific media channel (0.99 lag_weight) #136

Open steven-struglia opened 1 year ago

steven-struglia commented 1 year ago

Hello!

I'm wondering if there is a fix for a faulty Adstock lag_weight parameter that my MMM has learned. I'm using the hill_adstock model with some holiday features as extra features. For all 13 other media channels the lag_weight parameter looks great, and the model has a very low MAPE both in and out of sample with a high R2 (win!). I only found the faulty lag_weight because I was suspicious of this channel's contribution percentage. It currently sits at around 40% contribution (channel #7 in the contribution chart), which should not be the case (AFAIK) when it has the worst saturation curve and other media channels have better saturation curves plus more spend behind them.

I know this lag_weight is off because it would not make sense for the channel to carry over 99% of its value day over day. I thought of using a custom prior that brings this channel's lag_weight to something in the 0.6-0.7 ballpark, if that's at all possible. If this is a good idea, please let me know!
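
To put the 0.99 figure in perspective, here is a minimal sketch that assumes a plain geometric adstock, where the effect of a one-off media impulse after k periods is lag_weight ** k. The function name carryover_half_life is illustrative and not part of LightweightMMM, and the library's actual adstock transform may differ in its details.

```python
import math


def carryover_half_life(lag_weight: float) -> float:
    """Periods until a one-off media impulse decays to half its size.

    Assumes geometric decay: effect after k periods = lag_weight ** k.
    """
    if not 0.0 < lag_weight < 1.0:
        raise ValueError("lag_weight must be strictly between 0 and 1")
    return math.log(0.5) / math.log(lag_weight)


# Compare the learned value with the 0.6-0.7 range discussed above.
for w in (0.99, 0.7, 0.6):
    print(f"lag_weight={w}: half-life ~ {carryover_half_life(w):.1f} periods")
```

Under this assumption, a lag_weight of 0.99 implies a half-life of roughly 69 periods, while values in the 0.6-0.7 range imply a half-life of under 2 periods.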

michevan commented 1 year ago

Hi @steven-struglia !

Setting a more informative prior sounds like a reasonable thing to do here. I'd also recommend checking the correlation coefficients and VIF values for this channel (see the data quality analysis section in the example Colabs) to see if that gives you any more insight into what's causing this behavior.

steven-struglia commented 1 year ago

Hi @michevan !

Nothing wrong with the correlation coefficients, variance, VIF, or low spend for this channel following what was in the Colab example. I'm going to go the route of a custom prior. How do I access the lag_weight prior for this channel specifically?

Default is numpyro.distributions.Beta(concentration1=2., concentration0=1.) with shape (c, ), so would I just feed it in like this?

lag_weights = [numpyro.distributions.Beta(concentration1=2., concentration0=1.)] * len(media_columns)
lag_weights[7] = numpyro.distributions.Beta(concentration1=1.5, concentration0=1.)
custom_priors = {'lag_weight': lag_weights}

numpyro.set_platform('gpu')

mmm = lightweight_mmm.LightweightMMM(model_name="hill_adstock")

mmm.fit(
    media=media_data_train,
    media_prior=costs,
    target=target_train,
    extra_features=extra_features_train,
    media_names=mdsp_cols,
    custom_priors=custom_priors,
    seed=72,
    target_accept_prob=0.90,
)

michevan commented 1 year ago

I think so! I'd have to replicate it with mock data to see if the format is precisely correct, but that looks by eye like a reasonable way to do this.

steven-struglia commented 1 year ago

Turns out the above way doesn't work. I tried a bunch of different ways to feed it in, but I'm not certain it's possible to edit a single lag_weight prior and leave the rest the same by going through custom_priors. If you know of a way to do that, that would be great! Thanks in advance @michevan

It errors out in the fit() method with

TypeError: __init__() missing 1 required positional argument: 'concentration0'

michevan commented 1 year ago

Try:

import numpy as np

# Use floats: assigning 1.5 into an integer array would truncate it to 1.
c1 = np.array([2.0] * len(media_columns))
c1[7] = 1.5
c0 = np.ones(len(media_columns))
custom_priors = {'lag_weight': {'concentration1': c1, 'concentration0': c0}}
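
The same pattern can be wrapped in a small helper when more than one channel needs its own prior. This is only a sketch: beta_prior_kwargs is a hypothetical function, not part of LightweightMMM, and the count of 14 channels is taken from the "13 others" mentioned earlier in the thread. It produces the dict-of-arrays format that is reported to work in this thread.

```python
import numpy as np


def beta_prior_kwargs(n_channels, default=(2.0, 1.0), overrides=None):
    """Build per-channel Beta concentration arrays (hypothetical helper).

    `overrides` maps a channel index to (concentration1, concentration0).
    Arrays are created as floats so fractional values are not truncated.
    """
    c1 = np.full(n_channels, default[0], dtype=float)
    c0 = np.full(n_channels, default[1], dtype=float)
    for idx, (a, b) in (overrides or {}).items():
        c1[idx], c0[idx] = a, b
    return {"concentration1": c1, "concentration0": c0}


# 14 media channels, with a different prior only for channel index 7.
custom_priors = {"lag_weight": beta_prior_kwargs(14, overrides={7: (1.5, 1.0)})}
print(custom_priors["lag_weight"]["concentration1"])
```

Building the arrays with dtype=float avoids the silent truncation that NumPy applies when a fractional value is assigned into an integer array.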

steven-struglia commented 1 year ago

That works beautifully! Thanks so much @michevan

I'll let you know what results come out of this using custom priors.

steven-struglia commented 1 year ago

Also if you have a recommendation on which parameters of the Beta distribution to change and in which direction to change them in order to put a max cap on the lag_weight for any of the channels (not just this one), that would be great! @michevan

michevan commented 1 year ago

This is unfortunately going to depend on your specific datasets, but there's a lot of discussion about these functional shapes and a bunch of examples in the original Google paper on this methodology.
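
As general background: a Beta prior has support on the whole interval (0, 1), so it cannot impose a hard cap, but its two concentration parameters control how much prior mass sits near 1. The standard-library sketch below compares a few shapes that come up in this thread. The helper names beta_mean and prob_above are illustrative, and the 0.9 threshold is an arbitrary choice.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, with a = concentration1, b = concentration0."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)


def prob_above(threshold, a, b, steps=20000):
    """Prior probability that the variable exceeds `threshold` (midpoint rule)."""
    width = (1 - threshold) / steps
    total = sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(steps))
    return total * width


for a, b in [(2.0, 1.0), (1.5, 1.0), (1.0, 4.0)]:
    print(f"Beta({a}, {b}): mean={beta_mean(a, b):.3f}, "
          f"P(lag_weight > 0.9)={prob_above(0.9, a, b):.4f}")
```

Lowering concentration1 or raising concentration0 shifts prior mass toward smaller lag_weight values. The data can still pull the posterior away from the prior, so this shapes the estimate rather than capping it.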

aneverhart commented 1 year ago

Hello!

I was wanting to do the same thing as above, but I get this error: "OverflowError: Python int 139850521891472 too large to convert to int32"

Literally doing the same thing as above that michevan provided.

Any ideas?

wregter commented 1 year ago

@aneverhart

Hi, I am also running into the overflow error in my own project. Please let me know if you find out what causes it.

dmaruo-hdsb commented 1 year ago

@aneverhart @wregter Hi, I got the same error and fixed it in my environment by setting the JAX config first:

import jax
jax.config.update('jax_enable_x64', True)

JAX uses 32-bit values by default, and 64-bit mode comes with some caveats, but since LMMM is lightweight that should not be a serious problem. See https://jax.readthedocs.io/en/latest/notebooks/Common_Gotchas_in_JAX.html#caveats
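
A quick range check shows why the reported integer cannot be stored as a 32-bit value but does fit in 64 bits. This sketch only illustrates the size mismatch. It does not explain where the large integer comes from in the model code, which the thread does not establish.

```python
# Signed integer bounds for 32-bit and 64-bit types.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def fits(value: int, lo: int, hi: int) -> bool:
    """True if `value` lies within the inclusive range [lo, hi]."""
    return lo <= value <= hi


reported = 139850521891472  # the integer quoted in the OverflowError above
print("fits int32:", fits(reported, INT32_MIN, INT32_MAX))
print("fits int64:", fits(reported, INT64_MIN, INT64_MAX))
```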

gittybobomber commented 1 month ago

Quoting @michevan:

c1 = np.array([2.0] * len(media_columns))
c1[7] = 1.5
c0 = np.ones(len(media_columns))
custom_priors = {'lag_weight': {'concentration1': c1, 'concentration0': c0}}

Thanks @michevan, that works. But shouldn't we prefer smaller lag_weight values over larger ones, so that the adstock (carryover) effect decays quickly and we avoid "never-ending" media effects? If so, the default prior, numpyro.distributions.Beta(concentration1=2., concentration0=1.) for every media column, should be the other way round, or even alpha=1 (concentration1) and beta=4 (concentration0), like this:

[image: BetaFigure]

But if I try

c1 = np.array([1.0] * len(media_columns))
c0 = np.array([4.0] * len(media_columns))
custom_priors = {'lag_weight': {'concentration1': c1, 'concentration0': c0}}

I get the error "RuntimeError: Cannot find valid initial parameters. Please check your model again." I can make it work for a subset of media columns, like

c1[[0,1,4,7,8,9,10,11,13]] = 1
c0[[0,1,4,7,8,9,10,11,13]] = 4

but not for all. How can I fix this?

P.S. It works without error on a different data set. What is the logic behind this?

P.P.S. It has something to do with zeros in media_data: when a media channel has a lot of zero values, the RuntimeError mentioned above occurs. When I replace the zeros with a small number, e.g. 0.00001, it works without error. The error occurs when a media channel has more than 25% zero values; 22%, for example, is no problem. But how can this be solved? In reality, media has lots of zero days/weeks!
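
The check and workaround described above can be sketched with NumPy as follows. The function names and the toy data are made up for illustration. Replacing zeros with a small value is the workaround reported in this comment, not a verified fix, and the 25% threshold is an observation on one data set.

```python
import numpy as np


def zero_fraction_per_channel(media: np.ndarray) -> np.ndarray:
    """Share of exact zeros in each column of a (time, channels) array."""
    return (np.asarray(media) == 0).mean(axis=0)


def replace_zeros(media: np.ndarray, epsilon: float = 1e-5) -> np.ndarray:
    """Return a float copy with exact zeros replaced by a small positive value."""
    media = np.asarray(media, dtype=float)
    return np.where(media == 0, epsilon, media)


# Toy data: 4 periods, 2 channels (made-up numbers for illustration only).
media_data = np.array([[0.0, 5.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
print("zero share per channel:", zero_fraction_per_channel(media_data))

patched = replace_zeros(media_data)
print("zero share after patching:", zero_fraction_per_channel(patched))
```

Inspecting the zero share per channel first makes it easier to see which channels are likely to trigger the initialization error before deciding whether to patch the data.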