robjhyndman / forecast

Forecasting Functions for Time Series and Linear Models
http://pkg.robjhyndman.com/forecast

Plot() method fails for forecasts with non-finite prediction intervals #160

Closed dashaub closed 9 years ago

dashaub commented 9 years ago

Consider the following time series:

require(forecast)
testseries <- ts(c(16L, 17L, 21L, 21L, 39L, 85L, 17L, 
                     17L, 13L, 11L, 18L, 5L, 11L, 11L, 17L, 18L, 15L, 20L, 11L, 11L, 
                     14L, 15L, 16L, 12L, 12L, 18L, 17L, 15L, 14L, 13L, 13L, 20L, 14L, 
                     21L, 61L, 114L, 19L, 10L, 12L, 11L, 19L, 13L, 12L, 13L, 13L, 
                     14L, 3L, 10L, 20L, 10L, 20L, 16L, 12L, 12L, 18L, 8L, 15L, 13L, 
                     15L, 6L, 16L, 24L, 9L, 14L, 14L, 8L, 14L, 12L, 18L, 10L, 13L, 
                     22L, 10L, 14L, 13L, 13L, 9L, 8L, 5L, 15L, 9L, 16L, 13L, 7L, 15L, 
                     14L, 8L, 14L, 12L, 22L, 9L, 18L, 18L, 19L, 12L, 18L, 17L, 9L, 
                     14L, 12L, 18L, 14L, 13L, 19L, 14L, 15L, 13L, 16L, 11L, 4L, 16L, 
                     10L, 9L, 18L, 22L, 12L, 10L, 20L, 18L, 22L, 23L, 18L, 19L, 7L, 
                     28L, 30L, 15L, 18L, 21L, 16L, 13L, 27L, 17L, 16L, 20L, 24L, 14L, 
                     16L, 13L, 13L, 7L, 19L, 27L, 21L, 9L, 8L, 19L, 21L, 19L, 27L, 
                     21L, 14L, 27L, 30L, 32L, 33L, 24L, 21L, 16L, 27L, 21L, 32L, 25L, 
                     23L, 23L, 18L, 32L, 12L, 27L, 22L, 19L, 15L, 21L, 21L, 28L, 19L, 
                     19L, 28L, 17L, 13L, 32L, 26L, 30L, 21L, 27L, 26L, 20L, 21L, 22L, 
                     32L, 25L, 23L, 17L, 12L, 17L, 14L, 16L, 14L, 18L, 21L, 7L, 20L, 
                     16L, 18L, 18L, 12L, 23L, 17L, 12L, 25L, 21L, 20L, 15L, 24L, 14L, 
                     26L, 18L, 23L, 22L, 26L, 25L, 11L, 21L, 20L, 22L, 19L, 28L, 26L, 
                     13L, 17L, 21L, 14L, 16L, 19L, 12L, 7L, 18L, 22L, 22L, 19L, 23L, 
                     21L, 11L, 20L, 28L, 26L, 19L, 20L, 15L, 8L, 24L, 26L, 31L, 30L, 
                     31L, 18L, 15L, 21L, 19L, 21L, 15L, 21L, 26L, 12L, 27L, 19L, 19L, 
                     16L, 19L, 21L, 15L, 23L, 13L, 16L, 23L, 22L, 28L, 23L, 22L, 26L, 
                     25L, 29L, 30L, 19L, 11L, 16L, 13L, 19L, 25L, 19L, 17L, 4L, 17L, 
                     13L, 20L, 15L, 18L, 20L, 14L, 16L, 18L, 17L, 13L, 20L, 12L, 12L, 
                     19L, 14L, 27L, 24L, 20L, 10L, 9L, 19L, 16L, 14L, 15L, 14L, 27L, 
                     15L, 23L, 19L, 15L, 16L, 22L, 15L, 19L, 23L, 19L, 17L, 0L, 24L, 
                     24L, 10L, 23L, 19L, 16L, 18L, 18L, 24L, 9L, 19L, 18L, 9L, 17L, 
                     13L, 19L, 11L, 23L, 16L, 18L, 21L, 18L, 19L, 11L, 12L, 16L, 14L, 
                     0L, 15L, 27L, 10L, 11L, 15L, 11L, 8L, 20L, 18L, 17L, 22L, 18L, 
                     17L, 16L, 20L, 19L, 15L, 13L, 14L, 12L, 15L, 22L, 15L, 7L, 20L, 
                     18L, 17L, 10L, 12L, 12L, 10L, 16L, 14L, 22L, 17L, 19L, 12L, 149L, 
                     34L, 19L, 34L, 19L, 20L, 18L, 7L, 18L, 12L, 17L, 10L, 20L, 10L, 
                     21L, 41L, 17L, 18L, 23L, 15L, 17L, 12L, 25L, 21L, 15L, 17L, 18L, 
                     27L, 18L, 15L, 22L, 27L, 17L, 26L, 15L, 16L, 22L, 12L, 15L, 21L, 
                     15L, 24L, 12L, 19L, 20L, 15L, 14L, 16L, 12L, 12L, 29L, 15L, 20L, 
                     14L, 19L, 29L, 10L, 18L, 23L, 25L, 22L, 21L, 19L, 18L, 20L, 26L, 
                     18L, 19L, 22L, 15L, 10L, 22L, 15L, 16L, 22L, 20L, 15L, 13L, 22L, 
                     21L, 13L, 15L, 25L, 21L, 13L, 19L, 22L, 10L, 21L, 23L, 26L, 13L, 
                     14L, 21L, 21L, 20L, 16L, 15L, 8L, 14L, 19L, 14L, 19L, 19L, 21L, 
                     16L, 20L, 26L, 23L, 20L, 18L, 22L, 15L, 18L, 17L, 28L, 23L, 28L, 
                     22L, 7L, 26L, 30L, 30L, 20L, 23L, 12L, 12L, 31L, 33L, 33L, 19L, 
                     24L, 19L, 20L, 26L, 28L, 22L, 29L, 35L, 22L, 22L, 27L, 27L, 36L, 
                     30L, 19L, 30L, 11L, 24L, 25L, 25L, 31L, 26L, 17L, 16L, 23L, 24L, 
                     28L, 25L, 34L, 24L, 18L, 25L, 21L, 19L, 22L, 19L, 16L, 14L, 19L, 
                     22L, 23L, 21L, 18L, 21L, 10L, 23L, 13L, 20L, 24L, 22L, 30L, 15L, 
                     24L, 22L, 24L, 24L, 27L, 16L, 19L, 18L, 19L, 24L, 20L, 25L, 23L, 
                     15L, 22L, 28L, 24L, 25L, 23L, 19L, 21L, 20L, 33L, 31L, 25L, 27L, 
                     26L, 16L, 30L), frequency = 7)

This series could be fit nicely by an ARIMA model, but it has several outliers:

plot(testseries) # Several outliers

Regardless, if we fit it as is, auto.arima() selects a model and the prediction intervals are finite:

fitarima <- auto.arima(testseries)
# ARIMA(1,1,1)(1,0,0)[7]
fitarima

# Finite prediction intervals
forecast(fitarima)

We might try a Box-Cox transform to stabilize the variance, and using BoxCox.lambda() we will get a selected value near 0. However, when we forecast with this model the prediction intervals are not finite, so the plot() method fails.

# Model with only an intercept and a Box-Cox transform with lambda near 0
fitarimabc <- auto.arima(testseries, lambda = BoxCox.lambda(testseries))
fitarimabc

# Prediction intervals have lower bound 0 and upper bound Inf at all levels
forecast(fitarimabc)

# plot method fails since plot() can't handle non-finite values
plot(forecast(fitarimabc)) # error

# We can't do a log transform or lambda = 0 since the series has zeroes
which(testseries == 0) # 331 359

There is obviously an issue with the prediction intervals, and there are several possible ways to solve it (e.g. add one and apply a log or Box-Cox transform), but for now I'm addressing the failure of the plot() method when the prediction intervals contain Inf. I'd propose adding a check in plot.forecast() for whether the prediction intervals are finite. If they are not, the method would skip them and plot only the point estimates, similar to how plot(forecast(nnetar(testseries))) works without prediction intervals. A warning message when forecast() does not generate finite prediction intervals might be nice too.
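A rough sketch of the kind of check I have in mind. The helper name has_finite_intervals() is made up for illustration and is not part of the forecast package; it only assumes the object carries `lower` and `upper` components the way forecast() output does. The toy lists below stand in for real forecast objects so the snippet runs on its own.

```r
# Hypothetical helper (not part of the forecast package): returns TRUE only
# when every lower and upper prediction bound is a finite number.
# Assumes `fc` is a list with `lower` and `upper` components.
has_finite_intervals <- function(fc) {
  if (is.null(fc$lower) || is.null(fc$upper)) {
    return(FALSE)  # no prediction intervals stored at all
  }
  # is.finite() is FALSE for Inf, -Inf, NaN and NA
  all(is.finite(fc$lower)) && all(is.finite(fc$upper))
}

# Toy stand-ins for forecast objects, for illustration only
fc_ok  <- list(mean = c(20, 21),
               lower = cbind(c(10, 9)),
               upper = cbind(c(30, 33)))
fc_inf <- list(mean = c(20, 21),
               lower = cbind(c(0, 0)),
               upper = cbind(c(Inf, Inf)))

has_finite_intervals(fc_ok)   # TRUE
has_finite_intervals(fc_inf)  # FALSE
```

A plot method could call something like this first and, when it returns FALSE, draw only the point forecasts and possibly emit a warning.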

dashaub commented 9 years ago

I haven't tested it, but I'd imagine there would be a similar problem if the prediction intervals contain NaN. For example, this fails with the same error:

plot(wineind, ylim = c(0, NaN)) # error

This scenario is possible if the time series has very large numbers and the selected model has an increasing trend component.
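For what it's worth, base R's is.finite() treats NaN the same way as Inf, so a single finiteness check would probably cover both cases. The snippet below is only an illustration with made-up values (`vals`, `finite_flags` and `safe_ylim` are names chosen for this example). It also shows that range() with finite = TRUE gives usable axis limits when some values are non-finite.

```r
# is.finite() is FALSE for Inf, -Inf, NaN and NA alike
finite_flags <- is.finite(c(1, Inf, -Inf, NaN, NA))
finite_flags  # TRUE FALSE FALSE FALSE FALSE

# Made-up values standing in for interval bounds
vals <- c(5, 10, NaN, Inf)

# finite = TRUE drops all non-finite elements before taking the range
safe_ylim <- range(vals, finite = TRUE)
safe_ylim  # 5 10
```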