alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License

Is it possible to get prediction intervals (not confidence intervals)? #489

Closed: zora-no closed this issue 2 years ago

zora-no commented 2 years ago

Describe the question you have

After some time searching the source code, I figured out that the confidence interval returned by test_pred, confint = model.predict(..., return_conf_int=True) is the confidence interval of the fitted parameters. Is there also a way to get prediction intervals, i.e. "A prediction interval is an interval associated with a random variable yet to be observed, with a specified probability of the random variable lying within the interval. For example, I might give an 80% interval for the forecast of GDP in 2014. The actual GDP in 2014 should lie within the interval with probability 0.8. Prediction intervals can arise in Bayesian or frequentist statistics." (https://robjhyndman.com/hyndsight/intervals/)
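
For reference, a minimal sketch of the call in question (the wineind dataset and the model order here are just placeholders for illustration):

import pmdarima as pm

y = pm.datasets.load_wineind()
model = pm.ARIMA(order=(1, 0, 2)).fit(y)

# point forecasts plus the interval whose interpretation is in question;
# alpha=0.2 corresponds to an 80% interval, as in the example quoted above
preds, intervals = model.predict(n_periods=6, return_conf_int=True, alpha=0.2)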

Versions (if necessary)

No response

harisdtu commented 2 years ago

I'm pretty sure that what you get is prediction intervals, even if they are called confidence intervals. There is no interval on any "fitted parameter" when you call predict; the documentation says the intervals refer to the forecasts, not to any parameters.

If my interpretation is correct, I would suggest renaming confidence to prediction.

fkiraly commented 2 years ago

Hmm - slightly troubled by reading this.

There are two related, but often confused, concepts here:

- a confidence interval, which quantifies the uncertainty of an estimated quantity (e.g., the fitted parameters or the predicted mean) and does not include the error term of the future observation;
- a prediction interval, which quantifies the uncertainty of the future observation itself and therefore does include the error term.

To add to the confusion, you also have Bayesian predictive intervals which are not the same as Bayesian posteriors on the predictive mean, so not everything Bayesian is a confidence interval.

(Note, @harisdtu, confidence intervals do not need to be on fitted model parameters - they can be on a prediction statistic, too! So it's not obvious from the context what this is!)
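
To make the distinction concrete, here is a small generic sketch (plain normal-theory formulas with made-up numbers, not pmdarima code):

import numpy as np
from scipy import stats

yhat = 25000.0    # hypothetical point forecast
se_mean = 5000.0  # standard error of the predicted mean
sigma = 2500.0    # residual / innovation standard deviation
z = stats.norm.ppf(0.975)  # two-sided 95% quantile

# confidence interval: uncertainty about the predicted mean only
ci = (yhat - z * se_mean, yhat + z * se_mean)

# prediction interval: also includes the noise of the future observation,
# so it is always at least as wide as the confidence interval
se_obs = np.sqrt(se_mean ** 2 + sigma ** 2)
pi = (yhat - z * se_obs, yhat + z * se_obs)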

Imo we should be cautious here: it is tempting to say "oh, these will be predictive intervals for sure", but perhaps we should check? It would be bad if they are not, since in sktime we've just cleaned up some interfaces and are exposing the pmdarima returns as predictive intervals.

+1 to the question on clarifying this.

tgsmith61591 commented 2 years ago

Hey all. The intervals you're referring to are produced by a method called conf_int in statsmodels. Although statsmodels itself does not explicitly document this (as far as I can find), my understanding of their API makes me think that what we are getting back are actually confidence intervals.

Take this quick (imperfect) example:

import pmdarima as pm

y = pm.datasets.load_wineind()
mod = pm.ARIMA(order=(1, 0, 2)).fit(y)

start = len(y)
end = start + 5

# private API to directly fetch the raw forecast results
results = mod.arima_res_.get_prediction(
    start=start,
    end=end,
    exog=None,
)

# we can directly observe the results summary frame:
results.summary_frame()
# y          mean      mean_se  mean_ci_lower  mean_ci_upper
# 0  24400.113989  5155.787408   14294.956357   34505.271621
# 1  25635.745117  5295.188186   15257.366982   36014.123252
# 2  25405.766858  5327.046350   14964.947868   35846.585848
# 3  25387.323173  5327.250634   14946.103794   35828.542551
# 4  25385.844035  5327.251947   14944.622082   35827.065989
# 5  25385.725412  5327.251956   14944.503442   35826.947382
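
For completeness, calling the conf_int method mentioned above on the same results object returns exactly those mean_ci_* bounds:

results.conf_int(alpha=0.05)  # same values as mean_ci_lower / mean_ci_upper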

Their linear model API (e.g., OLS) computes two additional columns, [obs_ci_lower, obs_ci_upper], which can be interpreted as prediction intervals; this confirms to me that [mean_ci_lower, mean_ci_upper] form the confidence interval on the predicted mean.
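
Here is a quick sketch of that on the OLS side (synthetic data, purely to show the extra columns):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 1)))
y_lin = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=100)

ols_pred = sm.OLS(y_lin, X).fit().get_prediction(X[:3])
print(ols_pred.summary_frame().columns.tolist())
# ['mean', 'mean_se', 'mean_ci_lower', 'mean_ci_upper',
#  'obs_ci_lower', 'obs_ci_upper']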

However, the way they're accomplishing this is with a slightly different function signature for conf_int in the linear model:

def conf_int(self, obs=False, alpha=0.05):
    ...

If obs is true, they use se_obs rather than se_mean to compute the interval:

    @property
    def se_obs(self):
        return np.sqrt(self.var_pred_mean + self.var_resid)

    ...

    def conf_int(self, obs=False, alpha=0.05):
        se = self.se_obs if obs else self.se_mean

        q = self.dist.ppf(1 - alpha / 2., *self.dist_args)
        lower = self.predicted_mean - q * se
        upper = self.predicted_mean + q * se
        return np.column_stack((lower, upper))

As far as I can tell, we're missing an equivalent in the SARIMAXResultsWrapper that would let us obtain the prediction interval. But we are treading deep into statsmodels territory here, so I opened an issue on their side to get a definitive answer.

cc: @fkiraly @zora-no

tgsmith61591 commented 2 years ago

See Chad Fulton's answer on the statsmodels board here and a more in-depth discussion on their mailing list.

tl;dr:

Ultimately, the intervals produced by either SARIMAX (python) or Arima (R) don't fit either of the definitions above. In some sense they are more like the "Prediction interval" term, because they do take into account the uncertainty arising from the error term (unlike the "Confidence interval" as described above). But it is not an exact match because they don't take into account parameter estimation uncertainty.

fkiraly commented 2 years ago

Well, I missed the discussion over the weekend, but I would guess these are prediction intervals then? Unfortunately there is a lack of agreed definitions for the terms, and I also think there is some confusion around the concepts throughout the discussion, but I'm satisfied that, from an interface perspective, the intention seems to be "prediction interval" according to the definition I used in this notebook: https://github.com/sktime/sktime-tutorial-pydata-berlin-2022/blob/main/notebooks/3_hierarchical_global_forecasting.ipynb