Use Negative Binomial or Poisson to handle counts data?

wlad-svennik commented 7 years ago

There is a simple regression algorithm for count data, called Poisson regression. It assumes that every regressor has a multiplicative effect on the expected count. It's similar to regressing on the log of the data, except that it works even when the data contain zeros.
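To make the multiplicative effect concrete, here is a minimal sketch of Poisson regression with a log link, fitted by Newton's method on a single regressor. The function name `fit_poisson` and the toy data are illustrative assumptions, not anything from Prophet.

```python
import math

def fit_poisson(x, y, iters=25):
    """Fit log E[y] = a + b*x by Newton's method on the Poisson log-likelihood."""
    # Start the intercept at log(mean(y)); this is finite as long as some count is nonzero.
    a, b = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(a + b * xi) for xi in x]
        # Gradient of the log-likelihood: sum over (y - mu) * [1, x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian: sum over mu * [1, x][1, x]^T
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = gradient
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Toy counts that include a zero, which a plain log transform could not handle.
x = [0, 1, 2, 3, 4, 5]
y = [0, 1, 3, 4, 9, 16]
a, b = fit_poisson(x, y)
multiplier = math.exp(b)  # factor by which the expected count changes per unit of x
```

The fitted coefficient `b` lives on the log scale, so `exp(b)` is the multiplicative effect of one unit of `x` on the expected count.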

It's conceivable that you could replace the Poisson distribution with the more general Negative Binomial distribution. The NB distribution is a generalization of the Poisson that allows the variance to exceed the mean. In contrast, in a Poisson distribution the mean is always the same as the variance.

The main difficulty with just changing the Normal distribution to a Negative Binomial is that it then becomes necessary to add the constraint that $0 < \mu < \sigma^2$.

Does this seem like a good idea?

bletham commented 7 years ago

Stan has negative binomial parameterizations that don't require constraints, such as neg_binomial_2. I think this is a very reasonable thing to do. Basically the model is

y(t) ~ Normal(g(t) + X*beta, sigma**2)

where g(t) is the rate model. The alternative would then be

y(t) ~ NegativeBinomial(g(t) + X*beta, phi)

My only concern would be that if the count values are relatively small then we would likely need a lot of data to reasonably estimate seasonality. I think MCMC would be important here. On the other hand if counts are large, then a normal distribution probably provides a reasonable enough approximation. Worth doing though!
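For reference, the reason neg_binomial_2 needs no extra constraint is that it is parameterized by a mean $\mu > 0$ and a dispersion $\phi > 0$ rather than by a mean and a variance:

$$
\mathbb{E}[y] = \mu, \qquad \operatorname{Var}[y] = \mu + \frac{\mu^2}{\phi}.
$$

Since $\mu^2 / \phi > 0$ for any positive $\phi$, the variance always exceeds the mean, so $0 < \mu < \sigma^2$ holds automatically. The Poisson case is recovered in the limit $\phi \to \infty$.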

kidddw commented 5 years ago

I recently modified the underlying Stan model to assume a negative binomial distribution like that suggested above. My concern at the moment is that it appears that prophet scales the target values before fitting. This cannot be done when using such a distribution which assumes integer target values.

Are there any thoughts on the sensitivity of the prophet model to the scaling of the data? For instance, are parameters initialized under the assumption that the target values will be between 0 and 1?
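To illustrate the concern, here is a minimal sketch. It assumes the scaling divides by the largest absolute value, which is an assumption about the internals rather than something confirmed in this thread.

```python
# Hypothetical count series.
counts = [0, 3, 1, 7, 2]

# Assumed scaling step: divide by the largest absolute value so that y lies in [0, 1].
scale = max(abs(v) for v in counts)
scaled = [v / scale for v in counts]

# A count likelihood needs integer observations, which the scaled values are not.
all_integer = all(float(v).is_integer() for v in scaled)

# The original counts are recoverable by undoing the scaling.
recovered = [round(v * scale) for v in scaled]
```

Here `all_integer` is false, so the scaled series could not be passed to a Poisson or NB likelihood as is.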

bletham commented 5 years ago

Good point about the scaling for the negative binomial link function.

The Stan model hard-codes a prior of N(0, 1/2) on the noise term. This is rather weak for $y \in [0, 1]$, but it would probably need to be adjusted if the data were not scaled.

The priors on changepoint delta, seasonality beta, and holiday beta would also be scale-dependent. These can be set directly with the changepoint.prior.scale, seasonality.prior.scale, and holidays.prior.scale inputs, so there is no real difficulty there, other than that the default values might be really bad for unscaled data.

I'm pretty sure there isn't anything else that is scale dependent or assumes y <= 1.

One other thing to note is that the Gaussian link function is encoded in both the Stan model and the R/Python code, so you'll also need to adjust the sample_model function (https://github.com/facebook/prophet/blob/master/R/R/prophet.R#L1559) to make predictions.

oren0e commented 5 years ago

Regarding the Poisson link function: I thought it would be out in the 0.5 version. Do you know when you'll add this feature?

bletham commented 5 years ago

We were working on getting it in place with #865 but that got stuck by some perf issues in rstan for which we're still trying to figure out the best workaround, so unfortunately not yet.

sammo commented 5 years ago

Hi @bletham. Thank you for this amazing package called fbprophet!! It's moving the forecasting world to a new level. Question: are there any plans to get the negative binomial function into 0.6?

eromoe commented 5 years ago

@bletham

> got stuck by some perf issues in rstan

Could we assume it is fine in pystan? If so, could you please release a working Python version for now?

bletham commented 4 years ago

I was targeting for this to happen after #501, which would have made it easier / more generic, but after the addition of the cmdstanpy backend I've decided to no longer pursue #501 so this should just be done directly.

This should be able to look a lot like how there is currently a switch between the linear and logistic trends; here we would have a switch between link functions. The link function is defined in Stan, right here: https://github.com/facebook/prophet/blob/46e56119835f851714d22b285d2e4081853b9fb1/python/stan/unix/prophet.stan#L117 https://github.com/facebook/prophet/blob/46e56119835f851714d22b285d2e4081853b9fb1/python/stan/unix/prophet.stan#L124 So that would need a switch to alternatively use a NB/Poisson.

What is currently yhat in the rest of the codebase would now be the rate of that process - I feel like this is what we are really interested in anyway, so I don't think we'd need to make any changes to that. What would need to be changed is the uncertainty estimation. For that, the Gaussian link function shows up here: https://github.com/facebook/prophet/blob/46e56119835f851714d22b285d2e4081853b9fb1/python/fbprophet/forecaster.py#L1422-L1427 https://github.com/facebook/prophet/blob/46e56119835f851714d22b285d2e4081853b9fb1/R/R/prophet.R#L1579-L1583
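As a rough illustration of what the uncertainty step would look like with a count likelihood, here is a minimal sketch that draws predictive samples around a rate using a gamma-Poisson mixture instead of adding Gaussian noise. The function names and the example values of `mu` and `phi` are hypothetical, and this is not the code at the links above.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (fine for modest lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_neg_binomial(mu, phi, rng):
    """Draw one NB count with mean mu and dispersion phi via a gamma-Poisson mixture."""
    # A gamma with shape phi and scale mu/phi has mean mu and variance mu^2/phi.
    lam = rng.gammavariate(phi, mu / phi)
    return sample_poisson(lam, rng)

def interval(draws, width=0.8):
    """Empirical central interval of the given width from a list of draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int((1 - width) / 2 * n)]
    upper = ordered[int((1 + width) / 2 * n) - 1]
    return lower, upper

rng = random.Random(0)
mu, phi = 5.0, 2.0  # hypothetical rate (what yhat would become) and dispersion
draws = [sample_neg_binomial(mu, phi, rng) for _ in range(20000)]
lower, upper = interval(draws)
```

The draws are non-negative integers, so the resulting interval can never dip below zero, unlike an interval built from symmetric Gaussian noise.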

And I believe that should be it. So code-wise, this should not require massive changes. The main questions I have are around validation (checking on some realistic small-count datasets that this is doing something reasonable / making sure that the fitting doesn't fail / do we need NB or is Poisson sufficient?).

bletham commented 4 years ago

I just added a working negative binomial likelihood in #1544. It's a work in progress and will need some more validation, in particular we'll want to try it out on some real datasets. If you have real time series with count data that you can share, please post them on #1500. Thanks!

oren0e commented 4 years ago

@bletham I would be interested to contribute somehow to the effort of bringing these count-data likelihoods into the library since this was super important for me at my previous job. What can I do?

bletham commented 4 years ago

@oren0e sorry for the slow reply, and thanks for being willing to contribute! I'd like to get the NB likelihood in the package in the next couple weeks, and then do a version release at the end of the month. I have an initial sketch version in #1500 (PR is #1544). My main concern now is whether or not this will actually work on real data (I'm worried about numerical issues from the logit link function), so if you still have any real count-data time series that we could test this on, that'd be most useful. Beyond that, my PR is pretty bare bones. A review of #1544 would be great, especially to be sure I didn't mess things up when switching from the stan NB parameterization to the scipy parameterization, and then we'll have to see if there are any edge cases that PR isn't handling correctly, write tests, translate it to R, write documentation.
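For anyone reviewing the parameterization switch, here is a minimal sketch of the mapping between Stan's neg_binomial_2(mu, phi) and the (n, p) pair that scipy.stats.nbinom takes. The helper names are hypothetical and this is not the code from #1544; it only checks the algebra.

```python
def stan_to_scipy_nb(mu, phi):
    """Map a (mean, dispersion) pair to the (n, p) pair of the classic NB form."""
    n = phi
    p = phi / (phi + mu)
    return n, p

def nb_mean_var(n, p):
    """Mean and variance of the (n, p) negative binomial (failures before n successes)."""
    mean = n * (1 - p) / p
    var = n * (1 - p) / p ** 2
    return mean, var

mu, phi = 4.0, 1.5
n, p = stan_to_scipy_nb(mu, phi)
mean, var = nb_mean_var(n, p)
```

With this mapping the mean comes back as `mu` and the variance as `mu + mu**2 / phi`, which is the quickest sanity check for a swapped or inverted `p`.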

oren0e commented 4 years ago

@bletham sadly I don't have the data I was working on, since it was left on my previous company's servers. I will do my best to review the PR. I hope it does not involve a lot of Stan syntax, since I'm not familiar with it. I will add comments to the PR if I find something useful.

bletham commented 3 years ago

I implemented a NB likelihood in #1544. There were significant numerical issues around the hinge function that is required to convert the latent forecast into a positive process rate. Discussion of this is in #1500. As discussed there, I'm not very optimistic about the NB likelihood being broadly useful in Prophet due to these challenges. For the purposes of handling small-count data (especially when we're trying to get a forecast that stays positive), there are some much more robust approaches that are explored in #1668 that I think provide a better direction than a NB likelihood. So in light of the issues in my PR, this effort is deprioritized and probably won't ever make it into the package. Though interested individuals can of course patch in my PR and try it out!

facebook / prophet

Use Negative Binomial or Poisson to handle counts data? #337