sktime / sktime

A unified framework for machine learning with time series
https://www.sktime.net
BSD 3-Clause "New" or "Revised" License

[ENH] global ARIMA #5021

Open fkiraly opened 1 year ago

fkiraly commented 1 year ago

We should add a GlobalARIMA forecaster, which is the ARIMA family global forecaster. Originally requested by @olerch in https://github.com/sktime/sktime/discussions/5006. Related issue: https://github.com/sktime/sktime/issues/4651

ARIMA can be used as a global forecaster in the following way:

This is different from the current behaviour of ARIMA, which fits a separate ARIMA model per instance.
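To make the local/global distinction concrete, here is a minimal sketch using only numpy. It is not sktime code: it assumes a plain AR(1) model without differencing or MA terms, uses conditional least squares as a stand-in for a full ARIMA fit, and the names `simulate_ar1` and `ar1_least_squares` are hypothetical helpers. The "local" fit estimates one coefficient per series, the "global" fit estimates a single coefficient shared by all series.

```python
import numpy as np

rng = np.random.default_rng(1)


def simulate_ar1(phi, n, rng):
    """Simulate an AR(1) series y[t] = phi * y[t-1] + noise (hypothetical helper)."""
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.normal()
    return y


def ar1_least_squares(lagged, current):
    """Conditional least-squares estimate of the AR(1) coefficient."""
    return float(lagged @ current / (lagged @ lagged))


# A small panel: four series generated with the same true coefficient.
panel = [simulate_ar1(0.5, 100, rng) for _ in range(4)]

# Local fitting: one coefficient per series (analogous to the current per-instance behaviour).
local_phis = [ar1_least_squares(y[:-1], y[1:]) for y in panel]

# Global fitting: one coefficient shared across all series,
# estimated from the pooled (lagged, current) pairs.
lagged_all = np.concatenate([y[:-1] for y in panel])
current_all = np.concatenate([y[1:] for y in panel])
global_phi = ar1_least_squares(lagged_all, current_all)

print("local estimates:", np.round(local_phis, 3))
print("global estimate:", round(global_phi, 3))
```

In this simplified setting the global estimate is a weighted average of the local ones, so it always lies between the smallest and largest local estimate.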

As far as I know, this is not available in any of the "usual suspect" packages statsmodels, statsforecast, pmdarima, or elsewhere - except possibly in the very special case of fitting to a single time series instance.

Technically, this should not be too hard to implement, leveraging already existing functionality such as (possibly penalized/regularized) log-likelihoods implemented in statsmodels, and optimizing by SGD or a similar technique.

This issue is a good first issue for more statistics or data science oriented contributors. There are multiple ways to resolve this:

The estimator implementation/interfacing guide is here: https://www.sktime.net/en/stable/developer_guide/add_estimators.html

NguyenChienFelix33 commented 9 months ago

Hello professor, can I work on this?

fkiraly commented 9 months ago

Sure!

The first question on this issue would be one of literature research: is there an interfaceable implementation of global ARIMA out there? If yes, the issue would be pretty straightforward to interface. I do not know of any such implementation, though.

If not, then this becomes much more methodologically involved, and is a good first issue only for contributors with very solid statistics knowledge of how to fit ARIMA models.

A very similar model where we know a good implementation exists would be temporal mixed effects models, see here: https://github.com/sktime/sktime/issues/1767

Neither is an "easy" starting point, though. For your first contribution to open source, I suggest picking something that is light content-wise, to learn the GitHub contribution workflow.

fkiraly commented 8 months ago

Continuing discussion from https://github.com/sktime/sktime/pull/5689#issuecomment-1875073761:

> If I understand correctly, GlobalARIMA has static parameters, so what data should we use to train the parameters of GlobalARIMA?

The data would be passed in fit, in pandas MultiIndex format or similar.
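For illustration, a small panel in that shape can be built as follows. This is only a sketch of the data container: a pandas DataFrame with a two-level MultiIndex, instance first and time second. The level names, instance labels, and column name are made up for the example, and no `GlobalARIMA` class is assumed to exist yet.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical instance labels and a shared daily time index.
instances = ["series_a", "series_b", "series_c"]
timepoints = pd.date_range("2020-01-01", periods=12, freq="D")

# Two-level row index: (instance, time).
index = pd.MultiIndex.from_product(
    [instances, timepoints], names=["instance", "time"]
)

# One column of observations for all instances, stacked in long format.
y = pd.DataFrame({"value": rng.normal(size=len(index))}, index=index)

# Selecting one instance drops the outer level and yields a single time series.
y_a = y.loc["series_a"]

print(y.head())
print(y_a.shape)
```

A global forecaster would receive the whole of `y` in a single `fit` call and estimate one set of parameters from all instances, rather than being fitted once per instance.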