alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License
1.56k stars 228 forks

long run time when set m=365 #102

Closed sattar-hazrati closed 5 years ago

sattar-hazrati commented 5 years ago

Description

I have one year of daily data and want to predict the next 30 days. For this I set the following parameters:

arima = pm.auto_arima(train, error_action='ignore', trace=1, seasonal=True, m=365)

My problem is that each model takes a very long time to fit. Is it correct to set m=365 for daily data?

tgsmith61591 commented 5 years ago

Hey Sattar! Thanks for filing an issue.

This is a known issue with seasonal ARIMAs with large m values. In our next release (v1.2.0) we will be adding new tests of seasonality, which should help speed things up (see #88). Currently, seasonal differencing tests default to the CH (Canova-Hansen) test. We've got OCSB implemented and it will be the new default in v1.2.0, so you should see some speed-ups there.

That said, at the end of the day seasonal ARIMAs with large m are always going to run a bit slow. Since we delegate those model fits down to statsmodels, there's not a ton that can be done about that... even R chokes on those models. :-(
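To put rough numbers on why a large m is painful, here is a minimal sketch using only the standard arithmetic of a multiplicative SARIMA(p,d,q)(P,D,Q)[m] model: each seasonal difference consumes m observations, and the seasonal terms push the longest AR/MA lag out to a multiple of m. The helper name `seasonal_arima_footprint` and the orders (1,0,1)(1,1,1) are hypothetical, chosen only for illustration; m=7 appears purely as a comparison point, not as a recommendation from this thread.

```python
def seasonal_arima_footprint(n_obs, order, seasonal_order):
    """Rough size figures for a SARIMA(p,d,q)(P,D,Q)[m] model (illustrative helper)."""
    p, d, q = order
    P, D, Q, m = seasonal_order
    return {
        # each regular difference drops 1 observation, each seasonal one drops m
        "obs_after_differencing": n_obs - d - D * m,
        # longest lag in the expanded (non-seasonal x seasonal) AR polynomial
        "max_ar_lag": p + P * m,
        # longest lag in the expanded MA polynomial
        "max_ma_lag": q + Q * m,
        # number of complete seasonal cycles available in the data
        "full_cycles": n_obs // m,
    }


# One year of daily data, as in the original question
daily = seasonal_arima_footprint(365, (1, 0, 1), (1, 1, 1, 365))
weekly = seasonal_arima_footprint(365, (1, 0, 1), (1, 1, 1, 7))

print("m=365:", daily)
print("m=7:  ", weekly)
```

With a single year of data and m=365 there is only one complete cycle, and one seasonal difference leaves nothing to fit on, so the slowness is only part of the problem.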

If you'd like to build the bleeding edge and try your model with the OCSB test (default), you'll have to build the package from source until it's released:

$ git clone -b develop git@github.com:tgsmith61591/pmdarima.git
$ cd pmdarima
$ python setup.py develop

RahatKatal commented 5 years ago

Hi Sattar,

Did you find a workaround for a long seasonal period, i.e. a large value of m? I am currently facing the same issue.

tgsmith61591 commented 5 years ago

Closing since #103 is the exact same, but with more detail