alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License

trend is not updated after auto_arima call #122

Closed giuliobeseghi closed 5 years ago

giuliobeseghi commented 5 years ago

Description

The trend attribute is not updated after an auto_arima call. The fitted value does exist, but it is tucked away in xxx.arima_res_.params.

Steps/Code to Reproduce

from pmdarima.arima import auto_arima

myValues = [0.36551673, 0.95456757, 0.8503515 , 0.38304662, 0.46526238,
            0.45303935, 0.69334705, 0.44884844, 0.81635232, 0.83594706,
            0.29622491, 0.56847263, 0.59573789, 0.64304685, 0.94439795,
            0.43022153, 0.01149323, 0.3308818 , 0.81116443, 0.08777576,
            0.18624333, 0.36016438, 0.09968634, 0.04711341, 0.60606803,
            0.54327051, 0.34172425, 0.78256552, 0.5463768 , 0.66566099,
            0.48199031, 0.94381409, 0.84672259, 0.19064512, 0.41959927,
            0.70668892, 0.799846  , 0.02851747, 0.08961004, 0.10739706,
            0.89377463, 0.49907531, 0.54157248, 0.09978428, 0.37593662,
            0.09277836, 0.82528869, 0.74212871, 0.95266597, 0.67148465]

arima = auto_arima(myValues, maxiter=200)

print(arima.trend)
print(dict(zip(arima.arima_res_.param_terms, arima.arima_res_.params)))

Expected Results

0.508803937023001
{'trend': 0.508803937023001, 'ma': 0.22033713856720066, 'variance': 0.07748191709499463}

Actual Results

None
{'trend': 0.508803937023001, 'ma': 0.22033713856720066, 'variance': 0.07748191709499463}

Versions

Windows-10-10.0.17134-SP0
Python 3.7.1 (default, Oct 28 2018, 08:39:03) [MSC v.1912 64 bit (AMD64)]
NumPy 1.16.2
pmdarima 1.1.1
SciPy 1.2.1
Scikit-Learn 0.19.2
Statsmodels 0.9.0

tgsmith61591 commented 5 years ago

Thanks for the well-filed issue. This is an unfortunate case of name ambiguity: trend is an argument to the ARIMA class itself. Here's the docstring:

    trend : str or None, optional (default=None)
        The trend parameter. If ``with_intercept`` is True, ``trend`` will be
        used. If ``with_intercept`` is False, the trend will be set to a no-
        intercept value.

Inside the actual fit method, here is what's happening with trend:

            if not self._is_seasonal():
                ...
                if trend is None:
                    if self.with_intercept:
                        trend = 'c'
                    else:
                        trend = 'nc'
            else:
                ...
                if trend is None:
                    if self.with_intercept:
                        trend = 'c'
                    else:
                        trend = None

As in scikit-learn, fitting the model creates new fit attributes but never alters the input args, so that re-fits and updates stay possible. So we shouldn't expect arima.trend to be anything but the value that was passed in, which here is None.
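To make that convention concrete, here is a minimal sketch using a toy class. ToyEstimator, resolved_trend_ and intercept_ are hypothetical names invented for illustration; they are not part of pmdarima or scikit-learn. The fit method only mirrors the non-seasonal default resolution quoted above.

```python
class ToyEstimator:
    """Hypothetical estimator for illustration; not a pmdarima class."""

    def __init__(self, trend=None, with_intercept=True):
        # Constructor arguments are stored verbatim and never modified.
        self.trend = trend
        self.with_intercept = with_intercept

    def fit(self, y):
        # Resolve the default into a local variable, leaving self.trend alone.
        trend = self.trend
        if trend is None:
            trend = 'c' if self.with_intercept else 'nc'
        # Learned state goes into attributes with a trailing underscore.
        self.resolved_trend_ = trend
        self.intercept_ = sum(y) / len(y) if trend == 'c' else 0.0
        return self


model = ToyEstimator().fit([1.0, 2.0, 3.0])
print(model.trend)            # None
print(model.resolved_trend_)  # c
print(model.intercept_)       # 2.0
```

Because the constructor argument is left untouched, calling fit again on new data starts from the same configuration.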

Again, the name of the variable probably caused this confusion. The trend inside of the arima.arima_res_ itself is the intercept. If you run arima.summary(), you'll see that 0.5088 value:

                           Statespace Model Results
==============================================================================
Dep. Variable:                      y   No. Observations:                   50
Model:               SARIMAX(0, 0, 1)   Log Likelihood                  -7.030
Date:                Mon, 08 Apr 2019   AIC                             20.060
Time:                        06:52:43   BIC                             25.796
Sample:                             0   HQIC                            22.244
                                 - 50
Covariance Type:                  opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.5088      0.049     10.443      0.000       0.413       0.604
ma.L1          0.2203      0.164      1.343      0.179      -0.101       0.542
sigma2         0.0775      0.023      3.420      0.001       0.033       0.122
===================================================================================
Ljung-Box (Q):                       32.39   Jarque-Bera (JB):                 2.05
Prob(Q):                              0.80   Prob(JB):                         0.36
Heteroskedasticity (H):               1.55   Skew:                            -0.10
Prob(H) (two-sided):                  0.38   Kurtosis:                         2.03
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
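
The intercept shown in the summary can also be read programmatically through the same arima_res_ attributes used in the original report. The sketch below wraps that lookup in a small helper; fitted_params is a hypothetical name, and the stub object only stands in for a fitted model so the snippet runs without pmdarima installed. It assumes each term contributes exactly one parameter, as in this (0, 0, 1) model; with higher orders the zip pairing would likely misalign.

```python
from types import SimpleNamespace


def fitted_params(model):
    """Map each fitted term name to its estimated value (hypothetical helper)."""
    res = model.arima_res_
    # Assumes one parameter per term, which holds for the model in this issue.
    return dict(zip(res.param_terms, res.params))


# Stand-in for a fitted model, populated with the values quoted in this issue.
stub = SimpleNamespace(arima_res_=SimpleNamespace(
    param_terms=['trend', 'ma', 'variance'],
    params=[0.508803937023001, 0.22033713856720066, 0.07748191709499463]))

intercept = fitted_params(stub)['trend']
print(round(intercept, 4))  # 0.5088
```

With a real fitted model, passing the object returned by auto_arima in place of stub should yield the same dictionary as the print statement in the report.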