alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License

[MRG+1] Fix regression introduced by scikit-learn 1.2.0 #532

Closed aaronreidsmith closed 1 year ago

aaronreidsmith commented 1 year ago

Description

This PR attempts to fix the TypeErrors showing up in our nightly builds. I can't explain why, but removing the keyword args is necessary to get anything that is a subtype of _SetOutputMixin to pass.
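For illustration only, here is a minimal sketch of one way a TypeError like this can arise. It assumes a decorator that wraps `transform` and expects its first argument after `self` to be a positional `X`. This is not scikit-learn's code, and nothing in this thread confirms it is the actual cause; `wrap_output` and `ToyTransformer` are hypothetical names.

```python
import functools


def wrap_output(method):
    """Hypothetical stand-in for a decorator that wraps transform-style methods.

    It assumes the first argument after self is a positional argument named X.
    """
    @functools.wraps(method)
    def wrapped(self, X, *args, **kwargs):
        return method(self, X, *args, **kwargs)
    return wrapped


class ToyTransformer:
    """Toy transformer whose first argument is y, followed by X."""

    @wrap_output
    def transform(self, y, X=None, **kwargs):
        return y, X


t = ToyTransformer()

# Positional call: y lands in the wrapper's X slot and Xt travels in *args,
# so the underlying method receives (y, Xt) in the right order.
assert t.transform([1, 2], "Xt") == ([1, 2], "Xt")

# Keyword call: y already fills the wrapper's X slot, so X= collides with it.
try:
    t.transform([1, 2], X="Xt")
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None
print(error_message)
```

Under that assumption, passing both arguments positionally sidesteps the name clash, which is consistent with dropping the keyword args.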


msat59 commented 1 year ago

I applied the changes manually on my local machine and got this error while fitting a pipeline:

SyntaxError: positional argument follows keyword argument (pipeline.py, line 250) 
Traceback (most recent call last):

  File ~\miniconda3\envs\py38\lib\site-packages\IPython\core\interactiveshell.py:3460 in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  Cell In[1], line 16
    from pmdarima import arima, pipeline

  File ~\miniconda3\envs\py38\lib\site-packages\pmdarima\pipeline.py:250
    _, Xt = transformer.transform(y=None, Xt, **kw)
                                          ^
SyntaxError: positional argument follows keyword argument

aaronreidsmith commented 1 year ago

@msat59 did you rebuild/reinstall the updated code? That error reflects the old code, not the one that was changed. Here is how I would test this locally:

$ pip uninstall pmdarima
Found existing installation: pmdarima 2.0.1
Uninstalling pmdarima-2.0.1:
  Would remove:
    /Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima-2.0.1.dist-info/*
    /Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/*
Proceed (Y/n)? Y
  Successfully uninstalled pmdarima-2.0.1
$ sudo python setup.py install
...
$ cd .. # Need to switch directories or our code will complain
$ python -c 'import pmdarima; pmdarima.show_versions()'
/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

System:
    python: 3.10.6 (main, Sep 18 2022, 13:28:34) [Clang 13.1.6 (clang-1316.0.21.2.5)]
executable: /Users/asmith/.pyenv/versions/3.10.6/bin/python
   machine: macOS-13.2.1-arm64-arm-64bit

Python dependencies:
        pip: 22.3
 setuptools: 63.2.0
    sklearn: 1.2.0
statsmodels: 0.13.2
      numpy: 1.23.3
      scipy: 1.9.1
     Cython: 0.29.32
     pandas: 1.4.4
     joblib: 1.2.0
   pmdarima: 0.0.0
$ python
>>> from pmdarima import arima
>>> from pmdarima import pipeline
>>> from pmdarima import preprocessing as ppc
>>> import numpy as np
>>>
>>> max_p=5
>>> max_q=5
>>> ff_m=12
>>> ff_k=4
>>> n_jobs=4
>>> trend = 'c'
>>>
>>> train = np.array([ 53.49732848,  55.67194689,  58.38817983,  60.15814887,
...         60.78495554,  60.92771421,  61.30123253,  62.37336819,
...         64.31094699,  66.95783357,  69.91670478,  72.54269204,
...         73.76937463,  72.23523373,  67.0330952 ,  58.5990769 ,
...         49.0156232 ,  41.45220597,  38.91781587,  43.00469642,
...         53.33866198,  67.77987974,  83.06470788,  95.81475843,
...        103.74286951, 106.54764586, 105.90211977, 104.27736115,
...        103.47655163, 104.35387802, 107.59139252, 113.98787597,
...        123.92153058, 136.93004483, 151.58183531, 165.31885435,
...        175.02600621, 178.97489817, 178.31169515, 175.85739037,
...        173.50084961, 171.48318201, 169.87503182, 169.55511273,
...        171.45912256, 175.46712864])
>>>
>>> pipe = pipeline.Pipeline([
...             ("fourier", ppc.FourierFeaturizer(m=ff_m, k=ff_k)),
...             ("arima", arima.AutoARIMA(stepwise=False, trace=5, error_action="ignore",
...                                       seasonal=False,  # because we use Fourier
...                                       trend=trend, n_jobs=n_jobs,
...                                       max_p=max_p, max_q=max_q,
...                                       suppress_warnings=True))
...         ])
>>>
>>> pipe.fit(train)
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=267.537, Time=0.01 sec
Near non-invertible roots for order (0, 1, 1)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 1.000)
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=inf, Time=0.09 sec
Near non-invertible roots for order (0, 1, 2)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 0.999)
 ARIMA(0,1,2)(0,0,0)[0] intercept   : AIC=inf, Time=0.11 sec
Near non-invertible roots for order (0, 1, 3)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 0.992)
 ARIMA(0,1,3)(0,0,0)[0] intercept   : AIC=inf, Time=0.13 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=186.789, Time=0.03 sec
Near non-invertible roots for order (0, 1, 4)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 0.993)
 ARIMA(0,1,4)(0,0,0)[0] intercept   : AIC=inf, Time=0.15 sec
Near non-invertible roots for order (1, 1, 1)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 0.998)
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=inf, Time=0.10 sec
 ARIMA(1,1,2)(0,0,0)[0] intercept   : AIC=102.255, Time=0.11 sec
 ARIMA(0,1,5)(0,0,0)[0] intercept   : AIC=39.889, Time=0.17 sec
Near non-invertible roots for order (1, 1, 3)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 0.994)
 ARIMA(1,1,3)(0,0,0)[0] intercept   : AIC=inf, Time=0.14 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=136.624, Time=0.09 sec
Near non-invertible roots for order (2, 1, 1)(0, 0, 0, 0); setting score to inf (at least one inverse root too close to the border of the unit circle: 0.995)
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=inf, Time=0.11 sec
 ARIMA(1,1,4)(0,0,0)[0] intercept   : AIC=28.359, Time=0.17 sec
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=58.255, Time=0.12 sec
 ARIMA(3,1,0)(0,0,0)[0] intercept   : AIC=101.898, Time=0.10 sec
 ARIMA(2,1,3)(0,0,0)[0] intercept   : AIC=23.556, Time=0.15 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=65.152, Time=0.12 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=30.713, Time=0.12 sec
 ARIMA(4,1,0)(0,0,0)[0] intercept   : AIC=74.625, Time=0.12 sec
 ARIMA(4,1,1)(0,0,0)[0] intercept   : AIC=39.501, Time=0.15 sec
 ARIMA(5,1,0)(0,0,0)[0] intercept   : AIC=49.682, Time=0.14 sec

Best model:  ARIMA(2,1,3)(0,0,0)[0] intercept
Total fit time: 1.464 seconds
Pipeline(steps=[('fourier', FourierFeaturizer(k=4, m=12)),
                ('arima',
                 AutoARIMA(error_action='ignore', n_jobs=4, seasonal=False,
                           stepwise=False, trace=5, trend='c'))])
>>> pipe.predict(2)
46    180.418012
47    184.932259
dtype: float64
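
As a quick complementary check, the sketch below reports which copy of a package Python would import and which version is installed, using only the standard library. `describe_install` is a hypothetical helper name, not part of pmdarima. Running it from outside the source checkout should avoid picking up the local directory instead of the installed build.

```python
from importlib import metadata, util


def describe_install(package_name):
    """Return the installed version and import origin of a top-level package.

    Either value is None when it cannot be determined.
    """
    try:
        version = metadata.version(package_name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec locates a top-level package without importing it.
    spec = util.find_spec(package_name)
    origin = spec.origin if spec is not None else None
    return {"version": version, "origin": origin}


if __name__ == "__main__":
    print(describe_install("pmdarima"))
```

If the reported origin points at an old site-packages copy, the edited code was probably not the code being imported.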