alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License
1.59k stars, 234 forks

Huge difference between auto_arima with and without seasonality #525

Open tobiasderoos opened 2 years ago

tobiasderoos commented 2 years ago

Describe the question you have

Hi, I am forecasting several time series with auto_arima, once without seasonality and once with seasonality. This is part of an automated algorithm that compares both forecasts and then decides which one is better. The results for one time series are as follows:


```
 ARIMA(0,0,0)(0,0,0)[0] intercept   : AIC=1063.124, Time=0.00 sec
 ARIMA(1,0,0)(0,0,0)[0] intercept   : AIC=1065.064, Time=0.03 sec
 ARIMA(0,0,1)(0,0,0)[0] intercept   : AIC=1065.069, Time=0.06 sec
 ARIMA(0,0,0)(0,0,0)[0]             : AIC=1064.208, Time=0.00 sec
 ARIMA(1,0,1)(0,0,0)[0] intercept   : AIC=inf, Time=0.09 sec

Best model:  ARIMA(0,0,0)(0,0,0)[0] intercept
Total fit time: 0.235 seconds
Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,1,0)[12]             : AIC=895.902, Time=0.03 sec
 ARIMA(1,1,0)(1,1,0)[12]             : AIC=861.360, Time=0.50 sec
 ARIMA(0,1,1)(0,1,1)[12]             : AIC=inf, Time=0.56 sec
 ARIMA(1,1,0)(0,1,0)[12]             : AIC=879.416, Time=0.02 sec
 ARIMA(1,1,0)(2,1,0)[12]             : AIC=859.139, Time=4.96 sec
 ARIMA(1,1,0)(3,1,0)[12]             : AIC=857.942, Time=10.18 sec
 ARIMA(1,1,0)(3,1,1)[12]             : AIC=inf, Time=5.66 sec
 ARIMA(1,1,0)(2,1,1)[12]             : AIC=inf, Time=5.40 sec
 ARIMA(0,1,0)(3,1,0)[12]             : AIC=inf, Time=5.11 sec
 ARIMA(2,1,0)(3,1,0)[12]             : AIC=858.513, Time=3.30 sec
 ARIMA(1,1,1)(3,1,0)[12]             : AIC=inf, Time=4.75 sec
 ARIMA(0,1,1)(3,1,0)[12]             : AIC=inf, Time=3.49 sec
 ARIMA(2,1,1)(3,1,0)[12]             : AIC=inf, Time=17.40 sec
 ARIMA(1,1,0)(3,1,0)[12] intercept   : AIC=859.944, Time=8.52 sec

Best model:  ARIMA(1,1,0)(3,1,0)[12]
Total fit time: 739.036 seconds
```

What I am wondering is why the difference in total fit time between the two runs is so large: auto_arima with seasonality takes more than 700 seconds, compared with 0.235 seconds without. This happens regularly, for other time series as well.
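
For reference, the per-model times in the trace above can be added up and compared with the reported totals. The sketch below uses only the Python standard library; the `summarize` helper and its regular expressions are illustrative names, not part of pmdarima. On the numbers as posted, the listed seasonal fits sum to roughly 70 seconds, so most of the 739 seconds is not accounted for by the lines shown. The post does not say whether trace lines were omitted or whether the time was spent outside the individual fits.

```python
import re

# Trace lines copied from the post above (leading whitespace trimmed).
TRACE = """\
ARIMA(0,0,0)(0,0,0)[0] intercept   : AIC=1063.124, Time=0.00 sec
ARIMA(1,0,0)(0,0,0)[0] intercept   : AIC=1065.064, Time=0.03 sec
ARIMA(0,0,1)(0,0,0)[0] intercept   : AIC=1065.069, Time=0.06 sec
ARIMA(0,0,0)(0,0,0)[0]             : AIC=1064.208, Time=0.00 sec
ARIMA(1,0,1)(0,0,0)[0] intercept   : AIC=inf, Time=0.09 sec
Best model:  ARIMA(0,0,0)(0,0,0)[0] intercept
Total fit time: 0.235 seconds
ARIMA(0,1,0)(0,1,0)[12]             : AIC=895.902, Time=0.03 sec
ARIMA(1,1,0)(1,1,0)[12]             : AIC=861.360, Time=0.50 sec
ARIMA(0,1,1)(0,1,1)[12]             : AIC=inf, Time=0.56 sec
ARIMA(1,1,0)(0,1,0)[12]             : AIC=879.416, Time=0.02 sec
ARIMA(1,1,0)(2,1,0)[12]             : AIC=859.139, Time=4.96 sec
ARIMA(1,1,0)(3,1,0)[12]             : AIC=857.942, Time=10.18 sec
ARIMA(1,1,0)(3,1,1)[12]             : AIC=inf, Time=5.66 sec
ARIMA(1,1,0)(2,1,1)[12]             : AIC=inf, Time=5.40 sec
ARIMA(0,1,0)(3,1,0)[12]             : AIC=inf, Time=5.11 sec
ARIMA(2,1,0)(3,1,0)[12]             : AIC=858.513, Time=3.30 sec
ARIMA(1,1,1)(3,1,0)[12]             : AIC=inf, Time=4.75 sec
ARIMA(0,1,1)(3,1,0)[12]             : AIC=inf, Time=3.49 sec
ARIMA(2,1,1)(3,1,0)[12]             : AIC=inf, Time=17.40 sec
ARIMA(1,1,0)(3,1,0)[12] intercept   : AIC=859.944, Time=8.52 sec
Best model:  ARIMA(1,1,0)(3,1,0)[12]
Total fit time: 739.036 seconds
"""

TIME_RE = re.compile(r"Time=([0-9.]+) sec")
BEST_RE = re.compile(r"Best model:\s+(.*\S)")
TOTAL_RE = re.compile(r"Total fit time: ([0-9.]+) seconds")


def summarize(trace):
    """Return one dict per run: best model, sum of listed fit times, reported total."""
    runs, listed, best = [], 0.0, None
    for line in trace.splitlines():
        if (m := TIME_RE.search(line)):
            listed += float(m.group(1))  # per-candidate fit time
        elif (m := BEST_RE.search(line)):
            best = m.group(1)
        elif (m := TOTAL_RE.search(line)):
            # A "Total fit time" line closes the current run.
            runs.append({"best": best, "listed": listed, "total": float(m.group(1))})
            listed, best = 0.0, None
    return runs


runs = summarize(TRACE)
for run in runs:
    gap = run["total"] - run["listed"]
    print(f'{run["best"]}: listed={run["listed"]:.2f}s '
          f'total={run["total"]:.3f}s unaccounted={gap:.2f}s')
```

Running the same comparison on other series would show whether the gap between listed and total time is consistent, which may help narrow down where the time goes.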

### Versions (if necessary)

_No response_

james-l-wei commented 1 year ago

I agree. With even longer seasonal periods (e.g. 288 for data measured at 5-minute intervals), auto_arima hangs 0 or 1 steps into the search, even when left running overnight on very powerful hardware.
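
To put the seasonal period in numbers: 288 is the number of 5-minute samples in one day, assuming the cycle in question is daily (the comment does not say so explicitly). In a multiplicative seasonal ARIMA(p,d,q)(P,D,Q)[m], the autoregressive and differencing polynomials multiply out to a maximum lag of p + d + m(P + D), so the longest lag the model carries grows in proportion to m. The helper names below are hypothetical, and whether this growth is what causes the hang reported here is an assumption, not something confirmed in the thread.

```python
def seasonal_period(cycle_minutes, sample_minutes):
    """Number of samples per seasonal cycle (hypothetical helper)."""
    if cycle_minutes % sample_minutes != 0:
        raise ValueError("cycle length must be a multiple of the sampling interval")
    return cycle_minutes // sample_minutes


def max_ar_lag(p, d, P, D, m):
    """Degree of (AR poly) x (seasonal AR poly) x (1-B)^d x (1-B^m)^D."""
    return p + d + m * (P + D)


def max_ma_lag(q, Q, m):
    """Degree of (MA poly) x (seasonal MA poly)."""
    return q + m * Q


# Daily cycle sampled every 5 minutes, as in the comment above.
m_daily = seasonal_period(24 * 60, 5)

# Best model from the original post: ARIMA(1,1,0)(3,1,0)[12].
lag_monthly = max_ar_lag(p=1, d=1, P=3, D=1, m=12)

# The same orders with the daily period (illustrative only, not a model from the thread).
lag_daily = max_ar_lag(p=1, d=1, P=3, D=1, m=m_daily)

print(m_daily, lag_monthly, lag_daily)
```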