pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
https://www.pymc-marketing.io/
Apache License 2.0

Add `BetaGeoBetaBinomModel` #1031

ColtAllen closed 2 months ago

ColtAllen commented 2 months ago

Description

Reopening https://github.com/pymc-labs/pymc-marketing/pull/922 due to the force push.

The BG/BB model is for non-contractual purchase opportunities across discrete time periods; a good example would be sporting events.
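For readers unfamiliar with the model, here is a minimal sketch of the data-generating process as it is usually described for BG/BB: each customer has a purchase probability `p` and a per-period dropout probability `theta`, both Beta-distributed across customers. This is not code from this PR; the function name `simulate_bgbb`, the output field names, and the parameter values in the usage note are illustrative assumptions, and only the standard library is used.

```python
import random


def simulate_bgbb(n_customers, n_periods, alpha, beta, gamma, delta, seed=0):
    """Simulate per-customer (frequency, recency, T) under BG/BB assumptions.

    Hypothetical helper for illustration only; not part of pymc-marketing.
    """
    rng = random.Random(seed)
    rows = []
    for customer_id in range(n_customers):
        p = rng.betavariate(alpha, beta)       # purchase probability while alive
        theta = rng.betavariate(gamma, delta)  # dropout probability per period
        frequency = 0  # number of periods with a purchase
        recency = 0    # index of the last period with a purchase (0 if none)
        for t in range(1, n_periods + 1):
            # Assumption: dropout is checked at the start of every opportunity.
            if rng.random() < theta:
                break
            if rng.random() < p:
                frequency += 1
                recency = t
        rows.append(
            {
                "customer_id": customer_id,
                "frequency": frequency,
                "recency": recency,
                "T": n_periods,
            }
        )
    return rows
```

For example, `simulate_bgbb(1000, 6, 1.2, 0.75, 0.66, 2.78)` yields 1000 rows in which `frequency <= recency <= T` always holds, which is the kind of constraint the input validations need to check.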

Sampling is rather slow because NUTS defaults to compound sampling due to the discrete distributions used in this model. I recommend that https://github.com/pymc-labs/pymc-marketing/pull/707 be merged to speed up performance.

A small addition to CLVBaseModel was required for the input validations, and the dev notebook is rather minimalist right now but will be expanded into a full how-to in a future PR. An inefficiency in distribution_new_customers that is shared by all CLV models was also identified and will be fixed in a separate PR.

📚 Documentation preview 📚: https://pymc-marketing--1031.org.readthedocs.build/en/1031/

review-notebook-app[bot] commented 2 months ago

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



ColtAllen commented 2 months ago

Is there a ruff version discrepancy between the CI and the pre-commit config? I'm not getting these errors when working locally and all dependencies are up to date.

juanitorduz commented 2 months ago

pre-commit.ci autofix

juanitorduz commented 2 months ago

Fixed by https://github.com/pymc-labs/pymc-marketing/pull/1031/commits/b3abdab678ac7ae46a5491d26911cada469d494f using https://github.com/pymc-labs/pymc-marketing/pull/1031#issuecomment-2349538095

@ColtAllen If you wanna add some changes, make sure to pull first ;)

juanitorduz commented 2 months ago

@ColtAllen regarding

> Sampling is rather slow because NUTS defaults to compound sampling due to the discrete distributions used in this model. I recommend that https://github.com/pymc-labs/pymc-marketing/pull/707 be merged to speed up performance.

Do you see a speed-up with this model or all CLV models?

juanitorduz commented 2 months ago

@ColtAllen

This test is very slow:

6182.49s call     tests/clv/models/test_beta_geo_beta_binom.py::TestBetaGeoBetaBinomModel::test_model_convergence[mcmc-0.1]

Any ideas if we can make it a bit faster?

ColtAllen commented 2 months ago

> Do you see a speed-up with this model or all CLV models?

https://github.com/pymc-labs/pymc-marketing/pull/707 only impacts BetaGeoBetaBinomModel.

> This test is very slow. Any ideas if we can make it a bit faster?

Merging https://github.com/pymc-labs/pymc-marketing/pull/707 would make it about 3x faster. The only other way to speed it up would be to use a smaller fit dataset. There's another test here that is also rather slow, but addressing the TODOs will fix that.
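As a rough illustration of the smaller-dataset option, a fit dataset could be cut down reproducibly before sampling. This is only a sketch under assumptions: `subsample_customers` is a hypothetical helper, not something in the test suite, and it assumes the data is a list of per-customer records.

```python
import random


def subsample_customers(rows, n, seed=42):
    """Return n rows drawn without replacement, reproducibly for a fixed seed.

    Hypothetical helper for illustration only; not part of pymc-marketing.
    """
    rows = list(rows)
    if n >= len(rows):
        return rows  # nothing to cut; keep the full dataset
    return random.Random(seed).sample(rows, n)
```

A fixed seed keeps the convergence test deterministic, though a smaller sample widens the posterior, so any tolerance used in the assertion may need to be loosened to match.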

juanitorduz commented 2 months ago

Ok! Then from my side we can merge this one, and let's work on those to-dos 🙂