lavinei / pybats

Bayesian time series forecasting and decision analysis
https://lavinei.github.io/pybats/
Apache License 2.0
109 stars 26 forks

Negative binomial in addition #4

Open arita37 opened 3 years ago

arita37 commented 3 years ago

Can you add the negative binomial distribution in addition to the Poisson? Here are some references:

https://minimizeregret.com/post/2018/01/04/understanding-the-negative-binomial-distribution/

https://kth.diva-portal.org/smash/get/diva2:1249681/FULLTEXT01.pdf

thanks

lavinei commented 3 years ago

Hi,

A detail hidden within the Poisson DGLM is that the forecast distribution is negative binomial. A full discussion is given in this book, Chapter 14.

Briefly, the amount of dispersion in the forecast distribution increases with the uncertainty in the DGLM coefficients. As noted in the documentation, the forecast distribution is:

$$y_t \sim Pois(\mu_t)$$

However, there is a prior distribution on $\mu_t$: $$\mu_t \sim Ga(\alpha_t, \beta_t)$$

If you integrate out $\mu_t$, then marginally the forecast distribution becomes negative binomial.
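To spell out that integration step (this is the standard Gamma-Poisson calculation, not anything PyBATS-specific):

$$p(y_t) = \int_0^\infty \frac{\mu_t^{y_t} e^{-\mu_t}}{y_t!} \, \frac{\beta_t^{\alpha_t}}{\Gamma(\alpha_t)} \mu_t^{\alpha_t - 1} e^{-\beta_t \mu_t} \, d\mu_t = \frac{\Gamma(y_t + \alpha_t)}{y_t!\,\Gamma(\alpha_t)} \left(\frac{\beta_t}{1 + \beta_t}\right)^{\alpha_t} \left(\frac{1}{1 + \beta_t}\right)^{y_t},$$

which is a negative binomial with size $\alpha_t$ and success probability $\beta_t / (1 + \beta_t)$. Its mean is $\alpha_t / \beta_t$ and its variance is $\alpha_t (1 + \beta_t) / \beta_t^2$, strictly larger than the mean, which is exactly the extra dispersion mentioned above.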

On a practical note, if you are working with data that has greater-than-Poisson dispersion, you can also use the hyperparameter $\rho$, where $0 < \rho \le 1$; the smaller $\rho$ becomes, the larger the forecast variance. You can set this in the analysis function, or when you define a Poisson DGLM yourself, with rho=0.5, or some other value.
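For reference, here is a minimal sketch of setting $\rho$, based on the quick-start example in the PyBATS README (the data, horizon, and prior length are illustrative, and argument names may differ slightly between versions):

```python
from pybats.analysis import analysis
from pybats.shared import load_sales_example

# Example sales + advertising data shipped with PyBATS (counts, so a Poisson DGLM applies)
data = load_sales_example()
Y = data['Sales'].values
X = data['Advertising'].values.reshape(-1, 1)

mod, samples = analysis(Y, X, family="poisson",
                        k=1,                 # 1-step-ahead forecasts
                        forecast_start=15,   # first time step to forecast
                        forecast_end=35,     # last time step to forecast
                        prior_length=6,      # data points used to set the prior
                        rho=0.5)             # rho < 1 inflates the forecast variance
```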

Thanks, Isaac

arita37 commented 3 years ago

Thanks.

Why is it called Poisson if this is negative binomial?


lavinei commented 3 years ago

The model name corresponds to the observation equation conditioned on knowing all the parameters. For a Poisson DGLM, when you condition on $\mu_t$, the forecast distribution is Poisson.

Full details on DGLM theory are given in this book, Chapter 14. Additionally, here is the Wikipedia article on the Gamma-Poisson mixture.
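For a quick sanity check of that conditional-vs-marginal distinction, here is a small simulation (plain numpy/scipy, independent of PyBATS; the Gamma parameters are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha_t, beta_t = 5.0, 0.5   # illustrative Gamma(shape, rate) forecast parameters

# Conditional on mu_t, the observations are Poisson...
mu_t = rng.gamma(shape=alpha_t, scale=1.0 / beta_t, size=200_000)
y_t = rng.poisson(mu_t)

# ...but marginally (integrating out mu_t) they are negative binomial
nb = stats.nbinom(n=alpha_t, p=beta_t / (1.0 + beta_t))
print("simulated mean / variance:    ", y_t.mean().round(2), y_t.var().round(2))
print("neg. binomial mean / variance:", round(nb.mean(), 2), round(nb.var(), 2))
```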

arita37 commented 3 years ago

Thanks very much for the reference.

Just wondering, a natural extension would be a Mixture Density Network:

https://cedar.buffalo.edu/~srihari/CSE574/Chap5/Chap5.6-MixDensityNetworks.pdf

It could be done using some existing library. What do you think?


lavinei commented 3 years ago

Thanks for linking to the Mixture Density Network, it looks like a very interesting model and good research. However, the goal of the Mixture Density Network seems quite different from the goals of PyBATS. PyBATS is primarily focused on time series, especially computationally efficient models that can be run quickly. I appreciate the suggestion, but I'm not sure that I see them directly connecting.

arita37 commented 3 years ago

Thanks for the comments.

MDNs are designed more to handle "complex" distributions (i.e. outside of direct analytical formulae) at the expense of training cost...

Agree, and I understand the computational part. I think there are some MDN implementations which might be fast and quasi-linear.

Anyway, thanks for PyBATS, it's good.

There is also a package called Bayesian Structural Time Series (from Google). Is there any overlap of methodologies?

Thx
