LAMPSPUC / StateSpaceModels.jl

StateSpaceModels.jl is a Julia package for time-series analysis using state-space models.
https://lampspuc.github.io/StateSpaceModels.jl/latest/
MIT License

Implement the AutoARIMA algorithm #234

Closed guilhermebodin closed 3 years ago

azev77 commented 3 years ago

@guilhermebodin great idea! Please benchmark it w/ ARCHModels.jl

using ARCHModels;
auto_arma(df, ic) = selectmodel(ARCH{0}, df, meanspec=ARMA, criterion=ic, minlags=0, maxlags=10);

guilhermebodin commented 3 years ago

Hi @azev77, I am not sure how to do a proper benchmark. This is what I tried:

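using StateSpaceModels, ARCHModels
serie = randn(1000)  # white noise, so the true orders are p = q = 0
# ARCHModels.jl: ARCH{0} means constant variance, so only the ARMA mean orders are selected
@time am = selectmodel(ARCH{0}, serie; meanspec=ARMA);
loglikelihood(am)
aic(am)
# StateSpaceModels.jl
@time model = auto_arima(serie)
loglike(model)
model.results.aic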

Curiously, in this simple example ARCHModels.jl tends to choose p and q greater than 0 every time, even though the series is white noise. ARCHModels.jl is certainly faster; it probably uses a different estimation method.
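One caveat on the timings above: the first `@time` call on a function in a fresh Julia session also includes JIT compilation, so a single cold run can overstate the difference between the packages. A minimal sketch of a warmed-up timing helper follows. The name `warm_timing` is hypothetical and not part of either package, and the workload is a stand-in so the snippet runs without them.

```julia
# Hypothetical helper (not from either package): call `f` once so that
# compilation happens outside the measurement, then time repeated calls
# and keep the fastest one.
function warm_timing(f; repeats::Int = 3)
    f()                                          # warm-up call, triggers compilation
    times = [@elapsed(f()) for _ in 1:repeats]   # timed calls after warm-up
    return minimum(times)
end

# Stand-in workload so the sketch runs on its own.
serie = randn(1000)
workload() = sum(abs2, serie)
t = warm_timing(workload)
println("warm run took ", t, " seconds")

# With both packages loaded, the two calls from the example above
# could be passed in the same way:
# warm_timing(() -> selectmodel(ARCH{0}, serie; meanspec = ARMA))
# warm_timing(() -> auto_arima(serie))
```

Taking the minimum over a few repeats is only one convention; a dedicated benchmarking package would give more robust statistics.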

After testing on BG96, StateSpaceModels.NILE, and StateSpaceModels.INTERNET, I came to the conclusion that even with very similar models the two packages usually converge to different log-likelihoods.

Here is an example:

using StateSpaceModels, ARCHModels, CSV, DataFrames
internet = CSV.File(StateSpaceModels.INTERNET) |> DataFrame
serie = internet.dinternet[2:end]
# ARCHModels.jl: ARMA(1,1) mean with constant variance
@time am = fit(ARCH{0}, serie, meanspec=ARMA{1,1})
loglikelihood(am)
# StateSpaceModels.jl: evaluate the log-likelihood at fixed hyperparameter values
@time begin
    model = SARIMA(serie; order = (1,0,1), include_mean = true)
    fix_hyperparameters!(model, Dict("ar_L1" => 0.631033, "sigma2_η" => 10.1978, "ma_L1" => 0.501191, "mean" => 0.60756))
    StateSpaceModels.handle_optim_initial_hyperparameters(model)
    loglike(model)
end

azev77 commented 3 years ago

@s-broda maybe knows why

s-broda commented 3 years ago

Venturing a guess, it might be due to different handling of presample values? ARCHModels just uses the sample variance and sample mean to start the recursion.
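To make the guess concrete, here is a sketch of how presample handling can change the log-likelihood of the same ARMA(1,1) parameters. The parameterization and the symbols below are assumptions for illustration, and neither package's exact formulas were checked. A conditional likelihood fixes the presample values and runs the residual recursion from there, while a Kalman filter evaluates the likelihood through the prediction error decomposition.

```latex
% Assumed ARMA(1,1) form with mean \mu:
%   y_t - \mu = \phi (y_{t-1} - \mu) + \varepsilon_t + \theta \varepsilon_{t-1},
%   \varepsilon_t \sim N(0, \sigma^2)

% Conditional likelihood: presample values y_0 and \varepsilon_0 are fixed
% (for example from sample moments), then
\varepsilon_t = (y_t - \mu) - \phi (y_{t-1} - \mu) - \theta \varepsilon_{t-1},
\qquad
\ell_{\text{cond}} = -\frac{T}{2} \log(2\pi\sigma^2)
                     - \frac{1}{2\sigma^2} \sum_{t=1}^{T} \varepsilon_t^2

% Kalman filter: prediction error decomposition with one-step-ahead
% prediction error v_t and its variance F_t
\ell_{\text{KF}} = -\frac{1}{2} \sum_{t=1}^{T}
                   \left[ \log(2\pi F_t) + \frac{v_t^2}{F_t} \right]
```

For a stationary, invertible model, F_t converges to σ² as t grows, so any gap between the two values comes mainly from the first few observations, where the treatment of the initial state matters most.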

guilhermebodin commented 3 years ago

@s-broda does ARCHModels.jl use a Kalman Filter to estimate the hyperparameters?

s-broda commented 3 years ago

No.

guilhermebodin commented 3 years ago

Probably that is why it is faster.

azev77 commented 3 years ago

Btw, I strongly believe proper hyperparameter tuning should be based on out-of-sample fit.

guilhermebodin commented 3 years ago

This is an interesting discussion. In StateSpaceModels.jl we have a benchmark function that evaluates the skill of probabilistic forecasts generated by the models.
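The out-of-sample idea raised above can be sketched as a rolling-origin evaluation. This is a generic illustration that uses point forecasts and RMSE; it does not use the package's benchmark function, and `rolling_origin_rmse` and the two toy forecasters are hypothetical names introduced here.

```julia
using Statistics

# Hypothetical helper: `forecaster(train, h)` must return a vector of
# h point forecasts given the training window `train`.
function rolling_origin_rmse(y::AbstractVector{<:Real}, forecaster;
                             initial::Int, h::Int = 1)
    errors = Float64[]
    # Move the forecast origin forward one observation at a time.
    for origin in initial:(length(y) - h)
        train = y[1:origin]
        pred = forecaster(train, h)
        # Compare forecasts with the observations held out after the origin.
        append!(errors, y[origin+1:origin+h] .- pred)
    end
    return sqrt(mean(abs2, errors))
end

# Two toy forecasters standing in for fitted models.
mean_forecast(train, h) = fill(mean(train), h)
naive_forecast(train, h) = fill(train[end], h)

y = collect(1.0:20.0)   # deterministic upward trend
rmse_mean = rolling_origin_rmse(y, mean_forecast; initial = 10)
rmse_naive = rolling_origin_rmse(y, naive_forecast; initial = 10)
println("mean forecaster RMSE:  ", rmse_mean)
println("naive forecaster RMSE: ", rmse_naive)
```

Candidate models would be ranked by this kind of score instead of an in-sample criterion such as AIC, at the cost of refitting at every origin.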

guilhermebodin commented 3 years ago

I am closing the issue, but we can continue the discussion here.