facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License

Working with data of many 0s and 1s and very few 2,3 and 4s #1153

Closed Vidyapreethaam closed 4 years ago

Vidyapreethaam commented 5 years ago

I am trying to forecast daily sales for a product, and the demand values are mostly 0s and 1s. If I try to use Prophet, I get decimal values. Is there any workaround for this while using Prophet?

APramov commented 5 years ago

@Vidyapreethaam, I don't see an easy way to do that. I think the assumed data-generating process is continuous, since the error term is assumed to be normally distributed.

I guess it might be possible to do something if you change the distribution of the error term to something that fits your DGP better (a Poisson whose lambda you model as a function of time, in the same manner as the mean is modeled in Prophet? idk...), but frankly I am not sure how the whole structure would hold in your case, which seems to be a (zero-inflated?) count process. I'd be really curious to hear @bletham's take on this; maybe I am missing something. I don't have direct experience with such time series.

In the meantime, you might want to take a look at literature and models for time series analysis for (zero-inflated?) count processes.

amw5g commented 5 years ago

I agree with @APramov that you're going to have a tough time. Perhaps a GAM, such as Prophet uses, isn't the right fit. I'd look into negative binomial or zero-inflated Poisson (ZIP) models, as suggested.

That said, if you want to persist, you could aggregate your daily numbers up to a coarser time granularity, like weekly or biweekly, fit the Prophet model there, and then redistribute the aggregated counts back down to the daily level. That last step you'll need to do outside Prophet, based on your understanding of the domain.
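A rough sketch of that last redistribution step: the weekly yhat values below are stand-ins for whatever Prophet produces on the aggregated series, and all names are illustrative, not part of Prophet's API.

```python
# Sketch: spread each week's forecast total back to days using
# historical day-of-week shares of demand.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
daily = pd.DataFrame({"ds": pd.date_range("2019-01-01", periods=140, freq="D")})
daily["y"] = rng.poisson(0.4, size=len(daily))

# Historical share of weekly demand landing on each weekday (0=Mon..6=Sun).
dow_share = daily.groupby(daily["ds"].dt.dayofweek)["y"].sum()
dow_share = dow_share / dow_share.sum()

# Pretend weekly forecast (in practice: fit Prophet on weekly sums
# and take yhat from the forecast frame). Weeks start on Monday.
weekly_yhat = pd.Series(
    [3.1, 2.8, 3.5],
    index=pd.date_range("2019-05-27", periods=3, freq="W-MON"))

# Redistribute each weekly total across its seven days.
rows = []
for week_start, total in weekly_yhat.items():
    for d in range(7):
        day = week_start + pd.Timedelta(days=d)
        rows.append({"ds": day,
                     "yhat_daily": total * dow_share.get(day.dayofweek, 0.0)})
daily_forecast = pd.DataFrame(rows)
```

By construction the daily values sum back to each week's forecast, so the aggregate forecast is preserved.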

bletham commented 5 years ago

Agreed, the Prophet model will be a bad fit for these data since it uses a Gaussian likelihood. Ultimately we want to support a negative binomial or Poisson likelihood, where basically the Prophet model would represent the latent arrival rate (#337), but it's proven a bit tricky and so you'll have to do something else for now.

I think @amw5g's idea of aggregating to learn seasonality is really interesting: for instance, if there is yearly seasonality, then aggregating to weekly data would allow you to capture it with Prophet and then use that in combination with some other model, as @amw5g suggests.

trevor-pope commented 5 years ago

@Vidyapreethaam You could consider using something similar to Croston's method, which is used to forecast series with intermittent demand (i.e. long, but consistent, periods of zeroes followed by positive integers). Read more about it here
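For reference, the classic form of Croston's method is short enough to sketch directly. This is a minimal illustrative implementation, not library code:

```python
# Croston's method (classic form): exponentially smooth the nonzero
# demand sizes and the intervals between them separately, then forecast
# demand per period as smoothed_size / smoothed_interval.
import numpy as np

def croston(y, alpha=0.1):
    y = np.asarray(y, dtype=float)
    size = None       # smoothed nonzero demand size
    interval = None   # smoothed interval between nonzero demands
    periods_since = 0
    for value in y:
        periods_since += 1
        if value > 0:
            if size is None:
                # Initialize from the first nonzero observation.
                size, interval = value, periods_since
            else:
                size = size + alpha * (value - size)
                interval = interval + alpha * (periods_since - interval)
            periods_since = 0
    if size is None:
        return 0.0  # series was all zeros
    return size / interval

demand = [0, 0, 1, 0, 0, 0, 2, 0, 1, 0, 0, 1]
print(croston(demand))
```

The output is an average demand rate per period (a fraction between 0 and 1 for data like the above), which is the natural quantity for intermittent-demand series rather than a per-day integer.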

APramov commented 5 years ago

> Agreed, the Prophet model will be a bad fit for these data since it uses a Gaussian likelihood. Ultimately we want to support a negative binomial or Poisson likelihood, where basically the Prophet model would represent the latent arrival rate (#337), but it's proven a bit tricky and so you'll have to do something else for now.
>
> I think @amw5g's idea of aggregating to learn seasonality is really interesting: for instance, if there is yearly seasonality, then aggregating to weekly data would allow you to capture it with Prophet and then use that in combination with some other model, as @amw5g suggests.

@bletham , regarding your first point here - I have been looking into using Prophet for count data and wanted to see if I can contribute. Is there an overview somewhere of what you guys have done so far / what's left to do on this issue? Cheers

Vidyapreethaam commented 5 years ago

Thank you for all those inputs. In regard to this, I have another doubt: when I use Prophet at a weekly level, the yhat values are decimals. So, for the right forecasted value, do we round them? Or can we have forecasts for demand as decimals?
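(yhat is a point estimate on a continuous scale, so decimals are expected. One simple post-processing option, shown here purely as an illustration and not as part of Prophet, is to clip at zero and round to the nearest integer:)

```python
# Sketch: turn fractional yhat values into nonnegative integer demand.
# Clipping first avoids negative demand from the Gaussian likelihood.
import numpy as np

yhat = np.array([0.4, 1.6, -0.2, 2.3])  # illustrative yhat values
rounded = np.rint(np.clip(yhat, 0, None)).astype(int)
print(rounded)  # [0 2 0 2]
```

Whether rounding is appropriate depends on how the forecast is used; for ordering decisions, treating yhat as an expected value and working with the decimal directly is often fine.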

bletham commented 5 years ago

@APramov the difficulty in changing the likelihood is that the model is specified in two places in the code: it is specified first in the Stan, but then repeated in the R/Py code. That is because model fitting is done in Stan, but predictions are done purely in R/Py. So adding a new likelihood mode would require implementing the new likelihood model three times: in each of Stan, R, and Py. And the same goes for other desired customizations, like tweaking the trend model.

So the plan for handling this and other new model modes is to make model predictions using Stan, so that a change in the model need be implemented only once and we can more easily accommodate a larger number of modes. The typical workflow in Stan is to do predictions at the time of model fitting, which seemed too limiting for us and is not the typical API for software of this sort. Stan did not have the ability to make predictions from a previously fitted model until 2.18, which added the standalone generated quantities feature. However, as of the last time I looked into it (early summer), that feature had not yet propagated to rstan/pystan.

So that's the core of the challenge: making predictions from a previously fitted Stan model, despite this not being supported in rstan or pystan. We developed a trick for doing this in #865, where basically at predict time we redo the Stan inference but initialize the optimizer from the fitted values and do only 1 gradient step. This gets Stan to execute the generated quantities block, which is where we put the predict code. We actually got this working for producing yhat and its _lower and _upper bounds, but we found that it was very slow for producing posterior samples (needed for the predictive_samples function). The issue seemed to be in data transfer from CmdStan into R. But the person working on it wasn't able to dig more deeply into it, and I thought it best to wait a little and see if the standalone generated quantities feature would make it down to rstan/pystan before investing a lot more effort into this hack.

Standalone GQ did make it into rstan in the latest release, mid-summer. So the next step to getting this working is to try moving predictions into GQ as in #865, but then use the standalone GQ functionality to execute it. I plan to try this out over the next 6 weeks or so and see if I can get it working this time with the new rstan. If it does, then adding the new likelihood will be fairly straightforward. If not, then we'll have to decide on a new path, because a discrete likelihood is definitely the next major feature on the todo list.

If you're interested in working on this feature, you could take a look at #865; like I said, the first step is just to adapt that to use the new standalone GQ in rstan 2.19. I expect it to be a fairly substantial effort, but we could discuss it more on #501 if you wanted to try it out. In any case, I'll be working on it too, and once it's working (or not) I'll comment on #337; if you wanted to just implement the new likelihood model, that'd be a great contribution too.