facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License

Add option to include all posterior predictive samples in predict.forecast results #174

Closed hanowell closed 7 years ago

hanowell commented 7 years ago

I want to use prophet to build probabilistic forecasts of transaction counts for a large set of metropolitan areas in the United States. Then I want to create an index based on the probabilistic forecasts that compares a given metro area to other metro areas or to forecasts at higher levels of geographic aggregation. The index needs to take into account the uncertainty of the individual metro-level time series. If the posterior samples were returned with predict.prophet instead of just lower and upper bounds (e.g., yhat_lower and yhat_upper), this would be much easier to do. It would be ideal for this to be optional since not everyone needs the posterior draws. Another way to do this would be to just @export prophet::predict_*-type functions. I can call ?prophet::predict_uncertainty and get a help file, but cannot use the function itself in the CRAN release.

hanowell commented 7 years ago

I think the easiest way to do this would be to export prophet::predict_uncertainty, right?

bletham commented 7 years ago

This seems like a great thing to add. Unfortunately in the meantime there isn't an easy way for you to access the distributions. You would need to replicate this part of the code: https://github.com/facebookincubator/prophet/blob/master/R/R/prophet.R#L786-L796 as the output of predict_uncertainty is already intervals. You can access the unexported functions like prophet:::sample_model().

hanowell commented 7 years ago

I see two possible approaches until this enhancement is added, if it ever is:

  1. I use getNamespace() and attach() to temporarily attach the namespace of the prophet package (assuming R here)
  2. I fork your code and @export sample_model(), possibly adding a wrapper function (called tibble_samples()?) that coerces the list returned by sample_model() to a tibble.

sukwkim commented 7 years ago

Hello @bletham and @BrashEQLibrium, may I ask how this is going? I also need posterior samples for my forecasts. I am currently using a prophet model for causal impact analysis, and I need posterior samples when I check the difference between the forecasted and actual values. I can modify the original code myself, but has there been any progress on exposing the posterior samples? And could I also take part in this work?

hanowell commented 7 years ago

Hey, @sukwkim, I haven't made any progress on this, but an ongoing project in my department would greatly benefit from this feature, and I would be happy to test or help develop it if necessary. But if it's already under development for an upcoming release, :+1:

sukwkim commented 7 years ago

Thanks, @BrashEQLibrium. I will use the updated feature in my project, but if it's OK with you, we can discuss how to add this feature and release it. I will share my progress soon :)

bletham commented 7 years ago

Thanks to @sukwkim for adding this feature. This is pushed to the v0.2 branch in https://github.com/facebookincubator/prophet/commit/995fda07a96c939e258647fc98852b4091a272f1 and https://github.com/facebookincubator/prophet/commit/19e95311c27bb89d430b6a236e12fda149c746f4. For anyone interested in this, clone and install from the v0.2 branch and then try it out: m.predictive_samples(future) in Python and predictive_samples(m, future) in R.

In R you can install the v0.2 branch using the devtools package:

devtools::install_github('facebookincubator/prophet', subdir='R', ref='v0.2')

sukwkim commented 7 years ago

Thanks @bletham . :)

1mike12 commented 7 years ago

Not a data scientist, so sorry if this comes off as thick, but could anyone elaborate on how #238 relates to solving @BrashEQLibrium's question? Looking at the PR and comments, I honestly have no idea what's going on. I think I have a related problem: I'm trying to generate a CDF of the sum of future values between two dates.

Is what is being returned in m.predictive_samples() usable to create a CDF?

sukwkim commented 7 years ago

Hi @1mike12, maybe you can answer the question "From Jan 1 to Jan 30, what is the probability that the total number of events is over X?" with the following, where is_post selects the rows for Jan 1 through Jan 30:

result <- predictive_samples(m, future)
sum(apply(result$yhat[is_post, ], 2, sum) > X) / dim(result$yhat)[2]
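
For readers working in Python, the same calculation can be sketched with the standard library alone. This is only an illustration, assuming the samples are laid out as in the R snippet above (one row per date, one column per draw). The helper names `window_sums`, `prob_exceeds`, and `empirical_cdf` are hypothetical, and the `yhat` matrix here is synthetic stand-in data rather than real Prophet output.

```python
import random


def window_sums(yhat, is_post):
    """Sum each draw (column) over the rows flagged True in is_post."""
    n_draws = len(yhat[0])
    return [
        sum(row[j] for row, keep in zip(yhat, is_post) if keep)
        for j in range(n_draws)
    ]


def prob_exceeds(sums, x):
    """Share of draws whose window total is strictly above x."""
    return sum(s > x for s in sums) / len(sums)


def empirical_cdf(sums, x):
    """Share of draws whose window total is at most x."""
    return sum(s <= x for s in sums) / len(sums)


# Synthetic stand-in for the yhat sample matrix: 60 days x 500 draws.
# With real output, yhat would come from the predictive samples instead.
rng = random.Random(0)
yhat = [[rng.gauss(100.0, 10.0) for _ in range(500)] for _ in range(60)]

# Hypothetical window: the last 30 of the 60 days.
is_post = [30 <= day < 60 for day in range(60)]

sums = window_sums(yhat, is_post)
p_over = prob_exceeds(sums, 3000.0)   # P(total > X) for X = 3000
cdf_at = empirical_cdf(sums, 3000.0)  # empirical CDF evaluated at X = 3000
```

Evaluating `empirical_cdf` over a grid of thresholds gives an empirical CDF of the windowed total, which is one way to approach the question asked above.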

bletham commented 7 years ago

This is now available in v0.2 on CRAN and PyPI. It is described in the documentation here: https://facebookincubator.github.io/prophet/docs/uncertainty_intervals.html