google / lightweight_mmm

LightweightMMM 🦇 is a lightweight Bayesian Marketing Mix Modeling (MMM) library that allows users to easily train MMMs and obtain channel attribution information.
https://lightweight-mmm.readthedocs.io/en/latest/index.html
Apache License 2.0

Media contribution percentage sums up to 105% #266

Open Aanai opened 8 months ago

Aanai commented 8 months ago

I've built a few models (they converged) and I've noticed this with all of them.

When you sum the media contribution percentages of all the media channels plus the baseline contribution (from the dataframe returned by plot.create_media_baseline_contribution_df), the total is around 105%.

Why is this the case? Shouldn't it be 100%? @becksimpson

becksimpson commented 8 months ago

@Aanai I believe there is no guarantee it will be 100%. To calculate those contributions, it first calculates the sum_across_samples_percent_contribution of each individual media channel and of the baseline. I believe baseline_contribution_pct and media_contribution_pct_by_channel should sum to 1, but you should check this.
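One way to run that check is sketched below. This is only an illustration on a toy dataframe: the helper `rows_sum_to_one` and the column names (`channel_a_pct`, `channel_b_pct`, `baseline_pct`) are hypothetical and are not taken from LightweightMMM, so substitute whatever percentage columns your dataframe actually contains.

```python
import numpy as np
import pandas as pd


def rows_sum_to_one(df, pct_columns, tol=1e-6):
    """Return a boolean Series: True where the listed share columns sum to 1."""
    row_sums = df[pct_columns].sum(axis=1)
    return pd.Series(np.isclose(row_sums, 1.0, atol=tol), index=df.index)


# Hypothetical toy data; the column names are assumptions for illustration.
toy = pd.DataFrame({
    "channel_a_pct": [0.30, 0.25, 0.45],
    "channel_b_pct": [0.20, 0.25, 0.10],
    "baseline_pct": [0.50, 0.50, 0.50],
})

ok = rows_sum_to_one(toy, ["channel_a_pct", "channel_b_pct", "baseline_pct"])
print(ok.tolist())        # one flag per time period
print(toy[~ok])           # rows whose shares do not add up to 1
```

If every row passes, the percentages themselves are consistent and any discrepancy has to come from what they are multiplied by or compared against.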

These percentages are then multiplied by posterior_pred_df["avg_prediction"], which is the average prediction of the target across samples, clipped at 0. So at the very least, you are calculating contributions to the prediction and not to the actual target. If you mean that the summed contribution of channels and baseline is greater than the actual target, note that there is no guarantee that sum(prediction) = sum(actual). If you mean the contributions sum to 105% of the prediction, this could be because you are not comparing against the zero-clipped target prediction.
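A toy sketch of both effects, using made-up numbers rather than anything produced by LightweightMMM: the per-period shares sum to 1, yet the total contribution matches only the zero-clipped prediction, not the unclipped prediction or the actual target. The variable names here are assumptions for illustration only.

```python
import numpy as np

# Hypothetical toy numbers, not output from LightweightMMM.
avg_prediction = np.array([120.0, -10.0, 90.0])  # mean prediction per period
actual_target = np.array([115.0, 5.0, 85.0])     # observed target per period

# Per-period shares for two media channels plus baseline; each row sums to 1.
pct = np.array([
    [0.30, 0.20, 0.50],
    [0.25, 0.25, 0.50],
    [0.40, 0.10, 0.50],
])
assert np.allclose(pct.sum(axis=1), 1.0)

# Clip negative predictions to 0, then split each period by its shares.
clipped_prediction = np.clip(avg_prediction, 0.0, None)
contributions = pct * clipped_prediction[:, None]
total_contribution = contributions.sum()

ratio_vs_clipped = total_contribution / clipped_prediction.sum()
ratio_vs_unclipped = total_contribution / avg_prediction.sum()
ratio_vs_actual = total_contribution / actual_target.sum()

print(f"vs clipped prediction:   {ratio_vs_clipped:.3f}")
print(f"vs unclipped prediction: {ratio_vs_unclipped:.3f}")
print(f"vs actual target:        {ratio_vs_actual:.3f}")
```

In this toy case the single negative period is enough to push the total above 100% of the unclipped prediction, while the ratio against the clipped prediction stays at 1.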