robjhyndman / forecast

Forecasting Functions for Time Series and Linear Models
http://pkg.robjhyndman.com/forecast

Bats/Tbats strange forecasts #367

Closed Alexej36 closed 6 years ago

Alexej36 commented 8 years ago

While working with bats/tbats models, I found some strange behavior I want to report.

My understanding of bats/tbats is the following:

  1. bats/tbats models can also be used with non-seasonal data. With non-seasonal data, tbats models reduce to bats models.
  2. If use.damped.trend=FALSE, then for non-seasonal data bats models reduce to ETS models (provided the Box-Cox transformation and ARMA errors are disabled).
  3. If use.damped.trend=TRUE, then the long-term forecast trend approaches the average trend of the data and not 0, in contrast to damped ETS models (provided the Box-Cox transformation and ARMA errors are disabled).

Based on my understanding (1.-3.), there are 4 issues I want to clarify. They fall under three expectations:

  1. For non-seasonal data, the slope of the forecast should be constant if there is no damping in the model. It is not: the slope over the first two forecast months differs from the slope of the remaining forecast (see Issue 1 and Issue 2).
  2. With damping, the long-term slope of the forecast should approach the average slope of the data. It does not (see Issue 3).
  3. If the data is falling, I would expect the forecast to fall as well. In Issue 4 the forecast goes up while the data is falling.
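
The first two expectations rest on the point-forecast function of an additive-trend exponential smoothing model. Below is a minimal base-R sketch of that function, assuming ETS-style damping (the case that point 3 of the first list contrasts with bats). The helper `holt_forecast` and the level and trend values are made up for illustration; they are not taken from the forecast package or from a fitted model.

```r
# Point forecasts from a final level and trend under an additive trend.
# phi = 1 is the undamped (linear) case; phi < 1 is ETS-style damping.
# Illustrative helper only, not the forecast package's implementation.
holt_forecast <- function(level, trend, h, phi = 1) {
  level + cumsum(phi^seq_len(h)) * trend
}

# Illustrative values, not estimates from my_data.
fc_flat   <- holt_forecast(level = 0.65, trend = 0,     h = 10)
fc_linear <- holt_forecast(level = 0.65, trend = -0.03, h = 10)
fc_damped <- holt_forecast(level = 0.65, trend = -0.03, h = 10, phi = 0.8)

diff(fc_flat)    # no trend: every step is 0
diff(fc_linear)  # undamped trend: every step equals the trend
diff(fc_damped)  # damped trend: steps shrink geometrically towards 0
```

Under these assumptions a model without damping cannot change slope between forecast steps, which is why the plots in Issue 1 and Issue 2 looked wrong to me.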

library(forecast)
my_data<-c(0.88,0.88, 0.87, 0.88, 0.95, 0.93, 0.90, 0.96, 0.93, 0.88, 0.90, 0.94, 1.01, 1.02, 1.01, 1.0,
 1.03, 1.05, 1.02, 1.04, 1.04, 1.01, 1.01, 1.02, 1.02, 1.05, 1.07, 1.04, 1.05, 1.03, 1.03, 1.02,
 1.04, 1.04, 1.06, 1.06, 1.01, 0.98, 0.99, 0.98, 0.97, 0.94, 0.92, 0.92, 0.89, 0.84, 0.79, 0.76,0.72,0.70,0.65)

######## Issue 1 - non constant forecast if use.trend=FALSE

  my_fc1 <- forecast(bats(my_data,use.box.cox = FALSE,use.trend = FALSE,use.damped.trend = FALSE,use.arma.errors = FALSE),h=20)

  ##if it is basically an ANN ETS Model, why does the forecast first go down?

  plot(my_fc1)

###############

####### Issue 2 - non constant forecast slope if use.trend = TRUE and use.damped.trend = FALSE

  my_fc2 <- forecast(bats(my_data,use.box.cox = FALSE,use.trend = TRUE,use.damped.trend = FALSE,use.arma.errors = FALSE),h=20)

  ##if it is basically an AAN ETS Model, why does the slope of the FC change in the beginning?
  plot(my_fc2)

################

####### Issue 3 long term slope not equal the average slope if use.trend = TRUE and use.damped.trend = TRUE

  my_fc3 <- forecast(bats(my_data,use.box.cox = FALSE,use.trend = TRUE,use.damped.trend = TRUE,use.arma.errors = FALSE),h=200)

  ## long term forecast slope:
  long_term_slope <- my_fc3$mean[200]-my_fc3$mean[199]

  ## average slope from regression
  regr <- seq_along(my_data)
  fit <- lm(my_data ~ regr)
  regr_slope <- fit$coefficients[2]

  ## the long term forecast slope is not equal to the average slope from regression:
  long_term_slope - regr_slope
##########

########## Issue 4 strange forecast from tbats model:
  my_fc4 <- forecast(forecast:::fitSpecificTBATS(my_data[1:48], use.box.cox=FALSE, use.beta=TRUE,  seasonal.periods=c(6),use.damping=FALSE,k.vector=c(2)),h=200)

  ## why does the forecast go up although the whole data is falling?

  plot(my_fc4,type="l")

robjhyndman commented 8 years ago

1 & 2 were, sadly, a bug. Now fixed in https://github.com/robjhyndman/forecast/commit/50196ebda043dab176ea677060a9659d502cbf4c
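
One way to confirm the fix on an installed version is to test whether successive point forecasts change by a constant amount. The sketch below is a hedged example: `has_constant_slope` and `check_issue2` are hypothetical helper names, the tolerance is arbitrary, and the wrapper simply repeats the Issue 2 call from the report. What it returns on any particular version has not been verified here.

```r
# TRUE when successive point forecasts change by the same amount (within tol).
has_constant_slope <- function(point_forecasts, tol = 1e-8) {
  steps <- diff(as.numeric(point_forecasts))
  all(abs(steps - steps[1]) < tol)
}

# Hypothetical wrapper around the Issue 2 call; needs the forecast package,
# which is only looked up when the function is actually called.
check_issue2 <- function(y, h = 20) {
  fit <- forecast::bats(y, use.box.cox = FALSE, use.trend = TRUE,
                        use.damped.trend = FALSE, use.arma.errors = FALSE)
  has_constant_slope(forecast::forecast(fit, h = h)$mean)
}

# Usage, with my_data defined as in the report:
# check_issue2(my_data)
```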

I think 3. is just two different estimates of the slope based on different optimization criteria.
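
As a rough illustration of how much "the slope of the data" depends on the criterion, here is a base-R sketch that computes three different slope summaries for the series in the report. None of them is the criterion bats optimises; they are only meant to show that reasonable definitions disagree, so a mismatch with the regression slope is not by itself evidence of a bug.

```r
# Series from the original report.
my_data <- c(0.88, 0.88, 0.87, 0.88, 0.95, 0.93, 0.90, 0.96, 0.93, 0.88,
             0.90, 0.94, 1.01, 1.02, 1.01, 1.00, 1.03, 1.05, 1.02, 1.04,
             1.04, 1.01, 1.01, 1.02, 1.02, 1.05, 1.07, 1.04, 1.05, 1.03,
             1.03, 1.02, 1.04, 1.04, 1.06, 1.06, 1.01, 0.98, 0.99, 0.98,
             0.97, 0.94, 0.92, 0.92, 0.89, 0.84, 0.79, 0.76, 0.72, 0.70,
             0.65)

t_idx <- seq_along(my_data)

# Least-squares slope: minimises squared distances to one straight line.
ols_slope <- unname(coef(lm(my_data ~ t_idx))[2])

# Mean one-step change: depends only on the first and last observation.
mean_diff_slope <- mean(diff(my_data))

# Mean change over the last 12 steps: looks at recent data only.
recent_slope <- mean(tail(diff(my_data), 12))

c(ols = ols_slope, mean_diff = mean_diff_slope, recent = recent_slope)
```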

Not sure about 4. Will take a look.

ryninho commented 7 years ago

This yielded some pretty dramatic results in forecast v7.3:

library(forecast)
packageVersion('forecast')

Y <- c(1.060283, 1.009953, 20.183244, 8.032572, 10.408715)

y <- ts(Y)
plot(y)

mdl <- tbats(y)

pred <- predict(mdl)

pred

plot(pred)

Attached is a PDF showing the results: tbats_bug_7_3.pdf

robjhyndman commented 7 years ago

This is not a bug. It's just a bad model obtained from 5 observations.

ryninho commented 7 years ago

Rob, thank you for the quick reply. Agreed that the sample size is too small for a good model. I'm mass-producing forecasts daily, and the input series vary widely in length and behavior. Right now I'm exploring alternate forecasts using various methods, and I was looking for one to serve as a "one-stop shop" that would produce models as good as the input data allows for each series, with more conservative forecasts for shorter or wilder series. Is there a better alternative to TBATS in that case, apart from writing my own if-then algorithm selection based on the input series, or using hierarchical time series, which I suppose might help?

Thank you!

robjhyndman commented 7 years ago

ets is pretty good provided you have non-seasonal data, or seasonal data with one type of seasonality and seasonal frequency no more than 24. It is also much faster than tbats.
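
A minimal sketch of the if-then selection mentioned above, using that rule of thumb, might look like the following. The function names, the `min_obs` cut-off of 10, and the naive fallback for very short series are assumptions made for illustration, not recommendations from this thread; only the period limit of 24 comes from the comment above.

```r
# Pick a method name from series length and seasonality.
# min_obs is a hypothetical cut-off chosen for illustration.
choose_method <- function(y, min_obs = 10, max_ets_period = 24) {
  if (length(y) < min_obs) {
    "naive"   # too short to estimate a model reliably (assumed fallback)
  } else if (inherits(y, "msts") || frequency(y) > max_ets_period) {
    "tbats"   # multiple seasonal periods or a long seasonal period
  } else {
    "ets"     # non-seasonal, or one seasonal period of at most 24
  }
}

# Dispatch to the forecast package; it is only needed when this is called.
forecast_by_rule <- function(y, h = 10) {
  switch(choose_method(y),
    naive = forecast::naive(y, h = h),
    ets   = forecast::forecast(forecast::ets(y), h = h),
    tbats = forecast::forecast(forecast::tbats(y), h = h)
  )
}

# The 5-observation series from earlier in the thread falls back to "naive".
choose_method(ts(c(1.060283, 1.009953, 20.183244, 8.032572, 10.408715)))
```

Keeping the rule separate from the fitting step makes the thresholds easy to tune per data source without touching the model calls.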

mitchelloharawild commented 6 years ago

Issue 4 in the original bug report appears to behave better now.