pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
https://www.pymc-marketing.io/
Apache License 2.0

Implement cohort level Shifted BetaGeometric model #169

Open ricardoV94 opened 1 year ago

ricardoV94 commented 1 year ago

#133 implements the same model at user-level granularity.

It may make sense to implement a cohort-level model. The posteriors of the individual-level variables for customers who enrolled at the same time are identical anyway, so it doesn't make sense to sample each one separately when the data comes in large cohorts.

Some of the summary statistics in #167 may also only make sense for the cohort-level model.
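To make the cohort argument concrete, here is a minimal sketch in plain Python (standard library only, not the pymc-marketing implementation). It assumes the usual sBG parameterization, where P(T = t) = B(alpha + 1, beta + t - 1) / B(alpha, beta) and P(T > t) = B(alpha, beta + t) / B(alpha, beta). The function names and the churn counts are made up for illustration. The point is that the log-likelihood computed from per-period counts matches the one computed customer by customer, so nothing is lost by aggregating.

```python
from math import lgamma, exp, isclose


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def sbg_logp_churn(t, alpha, beta):
    """log P(T = t): the customer churns in period t (t = 1, 2, ...)."""
    return log_beta(alpha + 1, beta + t - 1) - log_beta(alpha, beta)


def sbg_logp_survive(t, alpha, beta):
    """log P(T > t): the customer is still active after period t."""
    return log_beta(alpha, beta + t) - log_beta(alpha, beta)


def cohort_loglike(churn_counts, n_survivors, alpha, beta):
    """Cohort-level log-likelihood; churn_counts[i] is the number of churners in period i + 1."""
    ll = sum(
        n * sbg_logp_churn(t, alpha, beta)
        for t, n in enumerate(churn_counts, start=1)
    )
    # Customers still active at the end of the window are right-censored.
    ll += n_survivors * sbg_logp_survive(len(churn_counts), alpha, beta)
    return ll


def individual_loglike(churn_times, n_periods, alpha, beta):
    """Customer-level log-likelihood; None marks a customer who is still active."""
    return sum(
        sbg_logp_survive(n_periods, alpha, beta)
        if t is None
        else sbg_logp_churn(t, alpha, beta)
        for t in churn_times
    )


# Made-up cohort of 1000 customers observed for 4 periods.
churn_counts = [130, 120, 90, 60]
n_survivors = 600

# Expand the counts into one record per customer.
churn_times = [
    t for t, n in enumerate(churn_counts, start=1) for _ in range(n)
] + [None] * n_survivors

ll_cohort = cohort_loglike(churn_counts, n_survivors, alpha=1.0, beta=2.0)
ll_individual = individual_loglike(churn_times, 4, alpha=1.0, beta=2.0)
```

The cohort version does 5 likelihood evaluations where the customer-level version does 1000, and a sampler gets the same saving at every step.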

juanitorduz commented 1 year ago

Do you mean the results from the paper http://brucehardie.com/notes/017/sBG_estimation.pdf ?

ricardoV94 commented 1 year ago

I was not thinking about that specifically, just the original paper (one cohort over time).

In the multiple-cohort paper you mentioned, they are just doing complete pooling across multiple cohorts, right? It would be nice if we didn't need to do anything extra to allow that. Something like:

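import pymc as pm
from pymc_marketing import clv

# Hypothetical API proposed in this issue (ShiftedBetaGeoCohortModel, concatenate_models)
alpha = pm.HalfNormal.dist()
beta = pm.HalfNormal.dist()
cohort1 = clv.ShiftedBetaGeoCohortModel(..., alpha_prior=alpha, beta_prior=beta)
cohort2 = clv.ShiftedBetaGeoCohortModel(..., alpha_prior=alpha, beta_prior=beta)
cohorts = clv.concatenate_models(cohort1, cohort2)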

And it would create a PyMC model that includes cohort1 and cohort2 as submodels, sharing the alpha and beta. Calling cohorts.fit() would then provide the relevant part of the InferenceData to cohort1 and cohort2 so that you could use their special summary/plotting methods.

This would be neat because we could ultimately use more structured hyperpriors across models, even of different nature (e.g., correlation between lifetime and value).
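The complete-pooling idea can be illustrated without any of the proposed API. The sketch below is plain Python (standard library only) and only assumes the standard sBG likelihood. The cohort data, the function names, and the grid search are illustrative assumptions, not how pymc-marketing fits models. Sharing alpha and beta across cohorts amounts to summing the per-cohort log-likelihoods, even when the cohorts have been observed for different numbers of periods.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def cohort_loglike(churn_counts, n_survivors, alpha, beta):
    """sBG log-likelihood of one cohort observed for len(churn_counts) periods."""
    norm = log_beta(alpha, beta)
    ll = sum(
        n * (log_beta(alpha + 1, beta + t - 1) - norm)
        for t, n in enumerate(churn_counts, start=1)
    )
    # Customers still active after the last observed period.
    ll += n_survivors * (log_beta(alpha, beta + len(churn_counts)) - norm)
    return ll


def pooled_loglike(cohorts, alpha, beta):
    """Complete pooling: every cohort shares the same alpha and beta."""
    return sum(cohort_loglike(c, s, alpha, beta) for c, s in cohorts)


# Made-up cohorts as (churn counts per period, survivors).
# The newer cohort has been observed for fewer periods.
cohorts = [
    ([130, 120, 90, 60], 600),
    ([140, 110], 750),
]

# Crude grid search over 0.1 .. 10.0 for the pooled maximum-likelihood estimate.
grid = [0.1 * k for k in range(1, 101)]
best_alpha, best_beta = max(
    ((a, b) for a in grid for b in grid),
    key=lambda ab: pooled_loglike(cohorts, ab[0], ab[1]),
)
```

In a Bayesian version the grid search would be replaced by priors on alpha and beta plus sampling, and a hierarchical variant would give each cohort its own alpha and beta drawn from shared hyperpriors instead of sharing them outright.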

Anyway... I was just talking about the single-cohort model here.

juanitorduz commented 1 year ago

Interesting! Makes sense! In addition, we could provide a hierarchical model?

ricardoV94 commented 1 year ago

What do you mean by hierarchical model? Pooling across hyperparameters? If so: yes, that was the idea as well