robjhyndman / forecast

Forecasting Functions for Time Series and Linear Models
http://pkg.robjhyndman.com/forecast

Application of `auto.arima()` on both `ts()` and `msts()` seasonal datasets #875

Closed englianhu closed 3 years ago

englianhu commented 3 years ago

1) Read Data

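First, I filter the dataset down to one week of 1-minute prices.

## read and filter dataset
## The dataset has 7200 observations: one week of 1-minute prices (5 days x 1440 mins)
## Packages this code appears to rely on (not shown in the original post):
## magrittr, lubridate, timetk, quantmod, forecast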
> timeID <- data_m1$index %>% 
+   as_date %>% 
+   unique %>% 
+   sort
> timeID %<>% .[. > as_date('2015-01-11')]
> dt <- timeID[1]
> 
> smp <- data_m1 %>% 
+     tk_xts(silent = TRUE)
> dt %<>% as_date
> smp <- smp[paste0(dt %m-% weeks(1) + seconds(59), '/', dt + seconds(59))]

2) Apply auto.arima() on both ts() and msts() seasonal datasets

2.1) ts() seasonal dataset

Next, I build a ts() seasonal dataset with a daily period of 1440 minutes and apply auto.arima().

## build a `ts()` seasonal dataset and apply `auto.arima()`
> sarimats <- smp %>% 
+     tk_ts(frequency = 1440)
> 
> fit_ts <- auto.arima(Op(sarimats), D = 1, trace = TRUE)
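
Setting D = 1 requests one seasonal difference at the series frequency, so with frequency = 1440 each minute is compared with the same minute one day earlier. The following is a minimal base-R sketch of how much data that differencing consumes. It uses a simulated random walk (`x` is made up for illustration and is not the price series from this post).

```r
# Sketch on simulated data; `x` is a made-up random walk, not the price series.
set.seed(42)
x <- ts(cumsum(rnorm(7200, sd = 0.05)), frequency = 1440)

# Seasonal difference (D = 1): each minute minus the same minute one day earlier.
seasonal_diff <- diff(x, lag = frequency(x))

# A first difference (d = 1) on top of that removes one more observation.
both_diff <- diff(seasonal_diff, lag = 1)

length(x)              # 7200
length(seasonal_diff)  # 5760
length(both_diff)      # 5759
```

With a daily period, one of the five days is used up by the seasonal difference and roughly four days remain for fitting.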

2.2) msts() seasonal dataset

Next, I tried to build an msts() seasonal dataset with daily (1440) and weekly (7200) periods and apply auto.arima(), but I got an error saying there was not enough data to proceed.

> mts <- smp %>% 
+     msts(seasonal.periods = c(1440, 7200))

As I understand it, the dataset has only 7200 one-minute observations in total, so msts(seasonal.periods = c(1440, 7200)) cannot work as a nested seasonal dataset: the data cover just one full weekly period. I therefore changed msts(seasonal.periods = c(1440, 7200)) to msts(seasonal.periods = c(60, 1440)), i.e. 60 minutes per hour and 1440 minutes per day, over the same 7200 minutes.
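
The data-length argument above can be checked with simple arithmetic. This is a minimal base-R sketch. The names `n_obs`, `periods`, `complete_cycles` and `remaining_after_diff` are illustrative and do not come from the original code.

```r
# Illustrative check (all names are hypothetical, not from the original post):
# how many complete seasonal cycles do 7200 one-minute observations contain?
n_obs <- 7200L
periods <- c(hourly = 60L, daily = 1440L, weekly = 7200L)

complete_cycles <- n_obs %/% periods
print(complete_cycles)       # hourly 120, daily 5, weekly 1

# A seasonal difference at lag m leaves n - m observations,
# so a lag equal to the series length leaves nothing to fit.
remaining_after_diff <- pmax(n_obs - periods, 0L)
print(remaining_after_diff)  # hourly 7140, daily 5760, weekly 0
```

A weekly period of 7200 gives only one complete cycle, so a weekly seasonal difference would leave no observations, which is consistent with the "not enough data" error.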

## The dataset has 7200 observations: 7200 mins per week
## There is not enough data for a weekly period, so here I use 60 mins and 1440 mins,
## which gives 5 daily cycles.
## build an msts() seasonal dataset and apply auto.arima()
#mts <- smp %>% 
#    msts(seasonal.periods = c(1440, 7200))
> sarimamsts <- smp %>% 
+  msts(seasonal.periods = c(60, 1440))

> fit_msts <- auto.arima(Op(sarimamsts), D = 1, trace = TRUE)

3) Comparison

Below I compare the two models.

> ## ---------------
> ## The model below uses the dataset containing 7200 mins and forecasts 1440 mins.
> fit_ts
Series: Op(sarimats) 
ARIMA(0,1,0)(0,1,0)[1440] 

sigma^2 estimated as 0.003333:  log likelihood=8251.72
AIC=-16501.45   AICc=-16501.44   BIC=-16494.79

> ## ---------------
> ## The model below uses the dataset containing 7200 mins and forecasts nested 60 mins & 1440 mins.
> fit_msts
Series: Op(sarimamsts) 
ARIMA(0,1,0)(0,1,0)[1440] 

sigma^2 estimated as 0.003333:  log likelihood=8251.72
AIC=-16501.45   AICc=-16501.44   BIC=-16494.79

> ## ---------------
> attributes(fit_ts)
$names
 [1] "coef"      "sigma2"    "var.coef"  "mask"      "loglik"    "aic"      
 [7] "arma"      "residuals" "call"      "series"    "code"      "n.cond"   
[13] "nobs"      "model"     "bic"       "aicc"      "x"         "fitted"   

$class
[1] "forecast_ARIMA" "ARIMA"          "Arima"         

> ## ---------------
> attributes(fit_msts)
$names
 [1] "coef"      "sigma2"    "var.coef"  "mask"      "loglik"    "aic"      
 [7] "arma"      "residuals" "call"      "series"    "code"      "n.cond"   
[13] "nobs"      "model"     "bic"       "aicc"      "x"         "fitted"   

$class
[1] "forecast_ARIMA" "ARIMA"          "Arima"
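
A plausible reason the two fits are identical: by default msts() sets the frequency of the series to the largest seasonal period, and auto.arima() works with that single frequency, so both objects are treated as having period 1440 (both printouts show `[1440]`). The following is a minimal sketch on simulated data. It assumes the forecast package is installed, and `x` is a made-up random walk, not the price series from this post.

```r
# Sketch with simulated data; `x` is a made-up random walk, not the price series.
library(forecast)

set.seed(1)
x <- cumsum(rnorm(7200))

x_ts   <- ts(x, frequency = 1440)
x_msts <- msts(x, seasonal.periods = c(60, 1440))

# Both objects report the same frequency.
frequency(x_ts)
frequency(x_msts)

# The full set of periods is kept only as an attribute on the msts object.
attr(x_msts, "msts")
```

If both frequencies come back as 1440, the hourly period of 60 never enters the ARIMA specification, which would explain the matching coefficients, likelihood and information criteria.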

4) Question

It took a few hours to get these results. My questions are:

I have also raised this question here: Application of auto.arima() on both ts() and msts() seasonal datasets

robjhyndman commented 3 years ago

This is not the place for questions like this. Stack Overflow was the right choice here, but it would help if you provided a reproducible example.