pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
https://www.pymc-marketing.io/
Apache License 2.0

How to contribute (and MAP-estimates) #212

Closed ChristianMichelsen closed 1 year ago

ChristianMichelsen commented 1 year ago

Hi,

First of all, thanks a lot for releasing such a great package!

I started looking into CLV and first found the lifetimes package, which later led me to this package. Even though I much prefer fully Bayesian models, I found that the computational complexity of using the full posterior sometimes led to infeasible run times on my data set. This was not a problem with lifetimes.

As such, I made some minor tweaks to allow for a MAP procedure, which of course is only an approximation, but which allows for much shorter run times in cases where that is important.

My changes are located here. If you find them relevant, I would be happy to contribute them to this package. However, I couldn't find a contribution guide, so how do you prefer to receive external contributions?

Thanks a lot in advance.

Cheers!

ricardoV94 commented 1 year ago

@ChristianMichelsen thanks for taking a look. I don't know if you saw, but we already allow fitting CLV models with MAP, in which case the trace has a single point. The summary methods should work fine out of the box with a trace that has a single "chain" and "draw" (the MAP).

https://www.pymc-marketing.io/en/stable/api/generated/pymc_marketing.clv.models.basic.CLVModel.fit.html#pymc_marketing.clv.models.basic.CLVModel.fit

Note: The argument that selects the fitting method will be renamed to fit_method in the next release.
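
To illustrate why a summary written for a chains-by-draws trace also works on a MAP result, here is a toy sketch using only the Python standard library. It does not use pymc-marketing or PyMC: the Beta-Bernoulli model, the prior values, and the helper names (`beta_posterior_params`, `map_trace`, `sampled_trace`, `summarize`) are all assumptions made for illustration.

```python
import random
from statistics import mean


def beta_posterior_params(successes, failures, prior_a=2.0, prior_b=2.0):
    """Conjugate update: Beta prior plus Bernoulli data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def map_trace(a, b):
    """Posterior mode of Beta(a, b), valid for a, b > 1, stored as 1 chain x 1 draw."""
    return [[(a - 1.0) / (a + b - 2.0)]]


def sampled_trace(a, b, chains=4, draws=1000, seed=0):
    """Draws from the full posterior, stored as chains x draws."""
    rng = random.Random(seed)
    return [[rng.betavariate(a, b) for _ in range(draws)] for _ in range(chains)]


def summarize(trace):
    """Summary that accepts any chains x draws shape, including 1 x 1."""
    flat = [draw for chain in trace for draw in chain]
    return {"mean": mean(flat), "n_points": len(flat)}


# Hypothetical data: 30 successes and 70 failures.
a, b = beta_posterior_params(successes=30, failures=70)
map_summary = summarize(map_trace(a, b))       # a single point, cheap to compute
full_summary = summarize(sampled_trace(a, b))  # many draws, carries uncertainty
print(map_summary, full_summary)
```

The MAP trace is cheap but carries no spread, which is the trade-off between run time and a full posterior discussed above.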

ColtAllen commented 1 year ago

Hey @ChristianMichelsen,

Thanks for offering to contribute! The latest version of our contributing guide is here:

https://github.com/pymc-labs/pymc-marketing/blob/main/CONTRIBUTING.md

But now that you mention it, a Pull Request Best Practices section would be an excellent addition, so I've created an issue for it.

It also seems your additions resolve https://github.com/pymc-labs/pymc-marketing/issues/168, so if you want to go ahead and create a Draft PR, we can advise you on where to go from there.

> I found that the computational complexity of using the full posterior sometimes led to infeasible run times in my data set. This was not a problem with lifetimes.

@ricardoV94 The new NUTS samplers in the latest pymc release are blazing fast. Are we able to use those yet in pymc-marketing?

ricardoV94 commented 1 year ago

Yes, you can pass arbitrary kwargs to pymc.sample via CLVModel.fit, including nuts_sampler, which gives access to the nutpie or JAX samplers.
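
The forwarding pattern described here can be sketched with plain Python. This is a toy illustration only: `ToyModel` and `sample` are hypothetical stand-ins and do not reproduce the real signatures of CLVModel.fit or pymc.sample. The only point is that extra keyword arguments given to `fit` reach the sampler unchanged.

```python
def sample(draws=1000, chains=4, nuts_sampler="pymc", **sampler_kwargs):
    """Hypothetical stand-in for a sampler entry point; it only records its arguments."""
    return {
        "draws": draws,
        "chains": chains,
        "nuts_sampler": nuts_sampler,
        **sampler_kwargs,
    }


class ToyModel:
    """Hypothetical model class, not the real CLVModel."""

    def fit(self, fit_method="mcmc", **kwargs):
        if fit_method == "mcmc":
            # Everything the caller passed is forwarded to the sampler as-is.
            return sample(**kwargs)
        if fit_method == "map":
            # A MAP fit yields a single point: one chain, one draw.
            return {"chains": 1, "draws": 1}
        raise ValueError(f"Unknown fit_method: {fit_method!r}")


# The sampler choice is passed to fit() and arrives at sample() untouched.
call = ToyModel().fit(nuts_sampler="numpyro", draws=500)
print(call)
```

With this design the model class does not need to know about each sampler option; new options in the sampler become available through `fit` without any change to the model.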

ColtAllen commented 1 year ago

> Yes, you can pass arbitrary kwargs to pymc.sample via CLVModel.fit, including nuts_sampler, which gives access to the nutpie or JAX samplers.

Would something like pip install pymc[numpyro] work for the JAX samplers? Or do users still need to follow this guide:

https://www.pymc.io/projects/docs/en/latest/installation.html

ricardoV94 commented 1 year ago

You still need to follow the guide; I don't think we have any optional dependencies defined in PyMC. If you think it would make things easier, feel free to open an issue on the repo.

ricardoV94 commented 1 year ago

@ChristianMichelsen can we close this issue, or is there something you think should be addressed code-wise that is not already tracked in other issues?

ChristianMichelsen commented 1 year ago

I had completely overlooked the general CLVModel and only looked at the BetaGeoModel and GammaGammaModel. Thanks a lot for the quick and thorough response!