Open drbenvincent opened 1 month ago
Sounds like an interesting idea and should be pretty straightforward to implement. There might be a similar API to the lift tests by using the `build_model` method and the `fit` method.
`contribution = pm.math.sum(beta*logistic(x, lambda))`
Do you mean theta here instead of lambda?
Have some additional thoughts to add when I can
Do you have any idea how much this nudges the likelihood? For instance, with 1 year of weekly data (52 observations), a single additional observation seems likely to have only a very small effect on the likelihood.
We should be able to try this out with a workflow like this:
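
    mmm = MMM(...)
    # Build the model graph first so the extra term can be added before fitting
    mmm.build_model(X, y)
    with mmm.model as model:
        total_contribution = pm.math.sum(model["..."], ...)
        # Extra likelihood term acting as a soft constraint on the total contribution
        pm.Normal("contribution_constraint", mu=0.2, sigma=0.1, observed=total_contribution)
    mmm.fit(X, y)
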
has a consequence on that metric. It might be good to do a before and after using the workflow above and see how much this metric moves.
What are your thoughts?
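One rough way to get a feel for how much a single constraint term nudges things is to compare log-density changes on a toy problem. The sketch below is plain Python rather than PyMC, and everything in it is an assumption made for illustration: the linear toy model, the fixed intercept, the noise scale, and treating the constraint as a Normal density over the channel's share of total sales.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2


# Hypothetical toy data: 52 weeks, sales = intercept + beta * spend (noise-free).
INTERCEPT, TRUE_BETA, NOISE_SIGMA = 10.0, 2.0, 1.0
spend = [1.0 + 0.5 * math.sin(t / 4.0) for t in range(52)]
sales = [INTERCEPT + TRUE_BETA * x for x in spend]


def data_loglik(beta):
    """Sum of the 52 weekly Normal log-likelihood terms."""
    return sum(
        normal_logpdf(y, INTERCEPT + beta * x, NOISE_SIGMA)
        for x, y in zip(spend, sales)
    )


def constraint_logp(beta, mu=0.2, sigma=0.1):
    """Single extra term: Normal density over the channel's share of total sales."""
    share = sum(beta * x for x in spend) / sum(sales)
    return normal_logpdf(share, mu, sigma)


# How much does each part of the log density change when beta moves 2.0 -> 2.5?
delta_data = data_loglik(2.5) - data_loglik(2.0)
delta_constraint = constraint_logp(2.5) - constraint_logp(2.0)
print(f"data terms: {delta_data:.3f}, constraint term: {delta_constraint:.3f}")
```

On this toy setup the single term moves the log density far less than the 52 data terms do, but the balance depends heavily on the assumed noise scale and on `sigma`, so a before-and-after fit on real data would still be needed.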
NOTE: we can probably use the `Prior` class here for some generality. For instance,

    # User can specify a constraint over the total contributions
    constraint = Prior("Gamma", mu=0.2, sigma=0.1)

    # Behind the scenes in the method
    with mmm.model as model:
        total_contribution = pm.math.sum(model["..."], ...)
        constraint.parameters["observed"] = total_contribution
        constraint.create_variable("contribution_constraint")

@wd60622 all good ideas. I'd imagine that if you wanted to put prior knowledge on total contribution, then you might have to make your priors on related transformation or contribution params more uninformative. Will do some experiments in what will turn into a docs page which showcases the feature.
I'm actually wondering if it makes more sense to approach in terms of % contribution rather than absolute sales?
FYI: Got the green light to work on this, so going to start on it this week / next week.
In MMMs we have many parameters. Here, let's focus on the parameters associated with the saturation function and weighting. Considering a generic saturation function, the contribution of channel $c$ at time $t$ is given by:
$$ \text{contribution}_{c,t} = \beta_c \cdot f(x_{c,t}, \theta_c) $$
where $\beta$ is a channel weighting and $\theta$ could be a parameter (or parameter list) associated with the saturation function $f$.
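To make the notation concrete, here is a minimal sketch of the contribution formula in plain Python. The logistic form used for $f$ is one common choice and is an assumption here, as are the names `logistic_saturation` and `channel_contribution` and the example numbers.

```python
import math


def logistic_saturation(x, lam):
    """One common saturating curve: 0 at x = 0, approaching 1 as x grows."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


def channel_contribution(beta, spend, lam):
    """contribution_{c,t} = beta_c * f(x_{c,t}, theta_c) for a single channel c."""
    return [beta * logistic_saturation(x, lam) for x in spend]


weekly_spend = [0.0, 0.5, 1.0, 2.0, 4.0]  # hypothetical spend for one channel
contributions = channel_contribution(beta=3.0, spend=weekly_spend, lam=1.0)

# The quantity the proposal wants to put prior knowledge on.
total_contribution = sum(contributions)
print(contributions, total_contribution)
```

Here $\beta$ sets the ceiling of the channel's per-period contribution and $\theta$ (`lam`) sets how quickly spend saturates, which is part of why business knowledge about totals can be hard to translate into priors on the two parameters separately.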
We might have business intelligence that tells us about a plausible range of channel contributions on average over a period of time. However, at the moment this business knowledge can only be encoded by placing priors over $\beta$ and $\theta$. In many cases it may not be easy to express business knowledge in terms of these parameters.
So my proposal is to allow the user to express prior knowledge over the total contribution of a channel.
We could implement this by adding another likelihood term as follows:
where `mu=0.2` is your prior over the contribution of sales of that channel, and `sigma` is how confident you are about that.
The idea is that the user could still provide prior knowledge on $\beta$ and $\theta$, but if they really couldn't express their knowledge in that parameter space, then these could be left as relatively uninformative priors. At least at the start of an iterative process.
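As a rough illustration of how such a term could carry the information when the prior on $\beta$ is left wide, the following plain-Python sketch scores a grid of $\beta$ values using only a wide prior plus the contribution term. It is not PyMC code and not a proposed API: the saturation form, the fixed $\theta$, the total-sales figure, and the reading of `mu=0.2` as a share of total sales are all assumptions.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2


def logistic_saturation(x, lam):
    """Assumed saturation curve f(x, theta) with theta = lam."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


spend = [1.0, 2.0, 0.5, 1.5] * 13  # 52 hypothetical weekly spends
TOTAL_SALES = 520.0                # hypothetical total sales over the period
LAM = 1.0                          # theta held fixed to keep the sketch 1-D
saturated_total = sum(logistic_saturation(x, LAM) for x in spend)


def contribution_share(beta):
    """Channel's total contribution as a fraction of total sales."""
    return beta * saturated_total / TOTAL_SALES


def score(beta, mu=0.2, sigma=0.1):
    """Wide (nearly uninformative) prior on beta plus the contribution term."""
    log_prior = normal_logpdf(beta, 0.0, 100.0)
    log_constraint = normal_logpdf(contribution_share(beta), mu, sigma)
    return log_prior + log_constraint


beta_grid = [i * 0.01 for i in range(1001)]  # 0.00, 0.01, ..., 10.00
best_beta = max(beta_grid, key=score)
print(best_beta, contribution_share(best_beta))
```

With the wide prior contributing almost nothing, the best-scoring $\beta$ is the one whose implied share sits near `mu`. In a real model $\theta$ would be uncertain too, so the term would constrain $\beta$ and $\theta$ jointly rather than pin down $\beta$ alone.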
I don't yet have a suggested API at this point, but I think this could be a really nice addition. Allowing users to express priors that are closer to business intelligence could be a big win.
After proposing this issue, it was pointed out that it is similar to the more vague #939, and also #1038.