Open sarah2397 opened 3 years ago
Heya, I'm not 100% confident in the modelling details (Ben Letham can probably provide a much better answer here), but the model structure is described here: https://github.com/facebook/prophet/blob/0616bfb5daa6888e9665bba1f95d9d67e91fed66/python/stan/unix/prophet.stan#L127-L142
Basically we have y, the observed data, and we're fitting it to a Normal distribution with mu = trend + seasonality + regressors and sd = sigma_obs. sigma_obs is fitted to the data, and I think that's what you're referring to as the noise term. For MAP estimation (i.e. mcmc_samples = 0), there will be just a single value for sigma_obs (which you can access in model.params['sigma_obs']), and this represents the average variability of a given data point.
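To get a feel for what a single fitted noise scale represents, here is a rough sketch in plain Python. It assumes the fitted mean (trend + seasonality + regressors) is already known and estimates the noise scale as the root-mean-square residual. That is the maximum-likelihood analogue for a Normal model, not Prophet's actual Stan fit, so the number it gives need not match model.params['sigma_obs']. The data and the helper name estimate_sigma are made up for illustration.

```python
import math


def estimate_sigma(y, mu):
    """Root-mean-square residual: the ML estimate of sd when y ~ Normal(mu, sd)."""
    residuals = [obs - fit for obs, fit in zip(y, mu)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Made-up observations and fitted means (stand-ins for trend + seasonality + regressors).
y = [10.0, 12.0, 11.0, 13.0]
mu = [10.5, 11.5, 11.5, 12.5]

sigma = estimate_sigma(y, mu)
print(sigma)
```

The residuals y - mu are also the closest thing to an "error term" you can compute yourself for the historical data.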
For the predicted value yhat, we don't actually use this sigma term, because yhat represents the expected value of each future data point. I.e. each future data point is distributed as Normal(trend + seasonality + regressors, sigma_obs), but the expected value of this is just trend + seasonality + regressors.
sigma_obs does get used when we do uncertainty estimation. We sample from the normal distribution described above, so sigma_obs affects how widely those samples can range. You can see the code for this here: https://github.com/facebook/prophet/blob/0616bfb5daa6888e9665bba1f95d9d67e91fed66/python/prophet/forecaster.py#L1477-L1483
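A minimal sketch of that sampling idea, in plain Python rather than Prophet's own code: draw many noisy values around a fixed yhat and take quantiles of the draws as the lower and upper bounds. The function name sample_interval, the sample count, and the interval width are illustrative assumptions, and the sketch ignores any other source of uncertainty (such as trend uncertainty) that the real implementation may include.

```python
import random


def sample_interval(yhat, sigma_obs, n_samples=1000, interval_width=0.8, seed=0):
    """Return (lower, upper) quantiles of draws from Normal(yhat, sigma_obs)."""
    rng = random.Random(seed)
    draws = sorted(yhat + rng.gauss(0.0, sigma_obs) for _ in range(n_samples))
    lower_q = (1 - interval_width) / 2
    lower = draws[int(lower_q * (n_samples - 1))]
    upper = draws[int((1 - lower_q) * (n_samples - 1))]
    return lower, upper


# Same point forecast, two different noise scales.
narrow = sample_interval(100.0, sigma_obs=1.0)
wide = sample_interval(100.0, sigma_obs=5.0)
print(narrow, wide)
```

The point forecast is the same in both calls; only the spread of the draws, and therefore the width of the interval, changes with sigma_obs.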
@tcuongd awesome explanation, I was just going to add that the noise term is included in the yhat_upper and yhat_lower columns in the forecast dataframe, and by extension it is part of the shaded uncertainty region that you see in the m.plot() visualization.
Thank you very much, this was very helpful!
So in the forecast dataframe, I have yhat_upper and yhat_lower, but I cannot see the exact value of sigma_obs that is used to calculate the bounds of the confidence intervals? We just sample from the normal distribution and we will include these sigma_obs values in the uncertainty intervals, right?
If I understand it right, for yhat itself, we just use one fixed sigma_obs. So in this case, it's not relevant for me to figure out the exact value.
We just sample from the normal distribution and we will include these sigma_obs values in the uncertainty intervals, right?
Yep, that's correct for yhat_lower and yhat_upper :)
for yhat itself, we just use one fixed sigma_obs
yhat doesn't actually rely on sigma_obs at all, since yhat is just the mean of the distribution (so yhat = trend + seasonality + holidays).
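As a quick sanity check of that statement, the sketch below averages many draws from a Normal distribution at two very different noise scales. The component values and the helper name mean_of_draws are invented for illustration; nothing here calls Prophet.

```python
import random
import statistics


def mean_of_draws(mu, sigma_obs, n=20000, seed=1):
    """Average n draws from Normal(mu, sigma_obs)."""
    rng = random.Random(seed)
    return statistics.fmean(rng.gauss(mu, sigma_obs) for _ in range(n))


# Hypothetical components of the mean: trend + seasonality + holidays.
mu = 3.0 + 1.5 + 0.5

small_noise = mean_of_draws(mu, sigma_obs=0.1)
large_noise = mean_of_draws(mu, sigma_obs=2.0)
print(small_noise, large_noise)
```

Both averages land close to mu, which is why the mean (yhat) can be written down without knowing sigma_obs at all.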
What is "y_scale" in the snippet above? Is there a way we can tell Prophet not to standardize 'y'?
Dear all,
I noticed that Prophet uses a decomposable time series model, which includes a trend, seasonality, holiday effects, and whatever additional regressors you want to use. But there is (of course) also a noise term. So is this noise term stochastic or deterministic? How is the error term calculated? Can I visualize or calculate the error term in my model myself?
I couldn't find any information about this part of the model and hope you can help me with this.
Best regards and thank you!