google / lightweight_mmm

LightweightMMM 🦇 is a lightweight Bayesian Marketing Mix Modeling (MMM) library that allows users to easily train MMMs and obtain channel attribution information.
https://lightweight-mmm.readthedocs.io/en/latest/index.html
Apache License 2.0

cost and media data is not clear from the documentation #240

Open chanansh opened 1 year ago

chanansh commented 1 year ago

In MMM, the input is usually the cost per channel per day, external features (e.g. holidays, conferences), and the target signal. Here, however, the costs are the total costs and the media data is not the cost. It's not clear how this fits the model description in the literature.

- Media data: Containing the metric per channel and time span (e.g. impressions per time period). Media values must not contain negative values.
- Extra features: Any other features that one might want to add to the analysis. These features need to be known ahead of time for optimization or you would need another model to estimate them.
- Target: Target KPI for the model to predict. For example, revenue amount, and number of app installs. This will also be the metric optimized during the optimization phase.
- Costs: The total cost per media unit per channel.

How can the model then be in ROI units and take the budget into account? Does it assume a fixed impression cost? Moreover, why should the cost be the prior? Does it assume the budget allocation was close to the optimal one?

becksimpson commented 1 year ago
  1. In MMMs I've seen people use either costs or impressions as the media data. The argument for costs is that varying impression prices and varying impression quality on some channels damage causality; the argument for impressions is that they have a more direct relationship to what users are experiencing. You can pass in scaled daily costs as the media_data instead. You're not bound to follow their examples: I personally pass in a mix of impressions for paid media and visitors for organic channels. As I'm interested in saturation and carryover effects for my organic channels as well, I pass them in as media data, not extra features.
  2. In LightweightMMM's documentation the media_prior is assumed to be relative total cost, but it can technically be any array that accurately represents your relative beliefs about the contribution of the mean-1-scaled media data to the mean-1-scaled target. So it must capture both the original scale (a channel with more impressions means more target) and the channel's effectiveness (a more effective channel means more target). (1-mean-scaled-total-costs * 0.15) is the example they show frequently, but I personally use a scaled mean-non-zero daily spend for my paid media channels (as some channels are turned off for long periods of time), and a scaled number of daily visitors x conversion rate (known from other data sources) as the priors for my organic channels.
  3. plot.plot_response_curves assumes a fixed average impression price, but as the docs explain, if your media_data is already costs, you can just pass in np.ones() as the prices parameter, since each unit of media is then one unit of spend (a 1:1 ratio). For clarity: to plot these curves, for each media channel it looks at the model's predicted response for the day after training, keeping all other media channels at 0 and setting the trialled channel to each of 50 equally spaced values between 0 and 1.2 x the maximum media level in the training data. It plots the difference between the predicted y at each of these media levels and the prediction with all media set to 0. The x-axis is then multiplied by the average price to turn it from impressions scale to spend scale, assuming a fixed price. The y-axis therefore represents the additional target you would see the day after your training set if you set your media spend (media impressions x average price) to the x-axis amount; with carryover effects, however, the majority of this gain may only be seen later. This does not affect the shape of the saturation curve, which still reflects the learned saturation function, but it can make the y-axis scale difficult to interpret.
  4. I believe cost is used as the prior because, in theory, cost captures both the scale and the presumed ROI of a channel. For example, we know Twitter is cheaper at the moment for the same number of impressions, as its impressions are presumed to be of poorer quality. Any other prior can be used, though. It doesn't assume the budget allocation was close to the optimal one, but it does assume there is some proportionality between a channel's level of spend and the return the channel generates. In all likelihood, a channel that only has 2% of your spend should not drive 50% of your target.
  5. The LightweightMMM model object itself only treats media_prior as the prior for the beta coefficients; it knows nothing about budget. Whether your media_data is costs or impressions doesn't matter, only that the media_prior reflects the relative effectiveness and original scale of each channel, i.e. its expected contribution to a mean-1 target, given that the model only sees a normalised mean-1 signal for each channel.
  6. If you're asking whether optimize_media.find_optimal_budgets assumes the budget allocation is close to the optimal one: by default it only allows each channel's spend to vary between 0.8x and 1.2x of the original media channel spend. This is common practice in budget recommenders to keep the recommendations actionable for marketing, so that the worst-performing channel isn't simply switched off.
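
To make the points about scaling, priors and the response-curve grid concrete, here is a rough NumPy-only sketch. It does not call LightweightMMM at all; the toy data, the helper names (`scale_to_mean_one`, `mean_nonzero`) and the variable names are all made up for illustration, and the 50-step grid and the 0.8x to 1.2x bounds simply follow the description above rather than the library's source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 days of daily spend for 3 paid channels.
n_days, n_channels = 100, 3
daily_spend = rng.uniform(10.0, 100.0, size=(n_days, n_channels))
daily_spend[:40, 1] = 0.0  # channel 1 was switched off for the first 40 days


def scale_to_mean_one(values):
    """Divide by the mean along axis 0 so the result averages to 1."""
    return values / values.mean(axis=0)


def mean_nonzero(values):
    """Per-channel mean over days with positive spend only."""
    active_days = (values > 0).sum(axis=0)
    return values.sum(axis=0) / np.maximum(active_days, 1)


# Costs passed as media data: each channel is scaled to mean 1 over time.
media_scaled = scale_to_mean_one(daily_spend)

# Two candidate priors, each scaled to mean 1 across channels.
prior_total_cost = scale_to_mean_one(daily_spend.sum(axis=0))
prior_active_spend = scale_to_mean_one(mean_nonzero(daily_spend))

# Response-curve grid as described above: 50 evenly spaced media levels per
# channel, from 0 up to 1.2 x the maximum level seen in training.
media_grid = np.linspace(0.0, 1.2 * daily_spend.max(axis=0), num=50)

# Media data is already spend here, so the price per unit is 1 and the spend
# axis equals the media grid. With impressions as media data, prices would
# hold the average cost per impression instead.
prices = np.ones(n_channels)
spend_axis = media_grid * prices

# Budget bounds as described above: 0.8x to 1.2x of each channel's spend.
mean_spend = daily_spend.mean(axis=0)
lower_bounds, upper_bounds = 0.8 * mean_spend, 1.2 * mean_spend
```

In this toy setup the channel that was off for 40 days gets a noticeably smaller prior under total cost than under mean-non-zero daily spend, which is the reason given above for preferring the latter when channels are paused for long periods.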