alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License
1.55k stars, 228 forks

Pipeline does not work when using FourierFeaturizer in colab #540

Open ams015 opened 1 year ago

ams015 commented 1 year ago

Describe the bug

I have run this code locally and it worked, but when I run it on Colab I get a ValueError.

To Reproduce

import pandas as pd
from pmdarima import pipeline
from pmdarima import model_selection
from pmdarima import preprocessing as ppc
from pmdarima import arima

# My dataset has 2k points; here are the first 10 for reproducibility.

A = [81.99130735, 82.02854792, 82.02854792, 79.41550171, 80.50789157,
     64.73651292, 70.85637886, 75.05835578, 75.89006169, 72.05428365]
B = pd.DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
                      '2013-01-05', '2013-01-06', '2013-01-07', '2013-01-08',
                      '2013-01-09', '2013-01-10'],
                     dtype='datetime64[ns]', name='Date', freq='D')

xt_a_train = pd.DataFrame(A, index=B)
pipe = pipeline.Pipeline([
    ("fourier", ppc.FourierFeaturizer(m=365.25, k=4)),
    ("arima", arima.AutoARIMA(stepwise=True, trace=1, error_action="ignore",
                              start_p=4, start_q=0, d=0, seasonal=True, m=7,
                              suppress_warnings=True))])
pipe.fit(xt_a_train)
print("Model fit:")
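For context on what the 10-point sample asks of the model, here is a rough NumPy-only sketch of generic Fourier regressors. It is not pmdarima's implementation; the helper name `fourier_terms` and the parameter tally are assumptions made for illustration. With k=4 it yields 8 exogenous columns, so the first candidate in the search already has more parameters than the sample has observations, which is one plausible reason every candidate fails on this reduced data.

```python
import numpy as np


def fourier_terms(n_obs, m, k):
    """Hypothetical helper: (n_obs, 2*k) matrix of sin/cos pairs for period m."""
    t = np.arange(1, n_obs + 1, dtype=float)
    columns = []
    for j in range(1, k + 1):
        angle = 2.0 * np.pi * j * t / m
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)


# Same settings as the report: 10 observations, m=365.25, k=4.
X = fourier_terms(n_obs=10, m=365.25, k=4)
n_obs, n_exog = X.shape

# Rough tally for ARIMA(4,0,0)(1,0,1)[7] with intercept (an approximation,
# not how statsmodels counts internally):
n_arma = 4 + 1 + 1                   # p + P + Q
n_params = n_exog + n_arma + 1 + 1   # + intercept + error variance

print(n_obs, n_exog, n_params)
```

If the count is the issue, the full 2k-point series may behave differently from this excerpt, so it is worth confirming the failure on the complete dataset as well.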

Versions

System:
    python: 3.8.10 (default, Nov 14 2022, 12:59:47)  [GCC 9.4.0]
executable: /usr/bin/python3
   machine: Linux-5.10.147+-x86_64-with-glibc2.29

Python dependencies:
        pip: 22.0.4
 setuptools: 57.4.0
    sklearn: 1.2.1
statsmodels: 0.13.5
      numpy: 1.22.4
      scipy: 1.10.1
     Cython: 0.29.33
     pandas: 1.3.5
     joblib: 1.2.0
   pmdarima: 2.0.2
Linux-5.10.147+-x86_64-with-glibc2.29
Python 3.8.10 (default, Nov 14 2022, 12:59:47) 
[GCC 9.4.0]
pmdarima 2.0.2
NumPy 1.22.4
SciPy 1.10.1
Scikit-Learn 1.2.1
Statsmodels 0.13.5

Expected Behavior

Code should run correctly and pick the correct model; it does so on desktop.

Actual Behavior

Error thrown:

Performing stepwise search to minimize aic
 ARIMA(4,0,0)(1,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(0,0,0)(0,0,0)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(1,0,0)(1,0,0)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(0,0,1)(0,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(0,0,0)(0,0,0)[7] : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(0,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(1,0,0)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(2,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(1,0,2)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(0,0,0)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(0,0,2)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(2,0,0)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(2,0,2)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(3,0,0)(1,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(5,0,0)(1,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,1)(1,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(3,0,1)(1,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(5,0,1)(1,0,1)[7] intercept : AIC=inf, Time=nan sec
 ARIMA(4,0,0)(1,0,1)[7] : AIC=inf, Time=nan sec

ValueError                                Traceback (most recent call last)
in
----> 1 pipe.fit(xt_a_train)
      2 print("Model fit:")
      3 print(pipe)

4 frames
/usr/local/lib/python3.8/dist-packages/pmdarima/pipeline.py in fit(self, y, X, **fit_kwargs)
    220         # Now fit the final estimator
    221         kwargs = named_kwargs[steps[-1][0]]
--> 222         self._final_estimator.fit(yt, X=Xt, **kwargs)
    223         return self
    224

/usr/local/lib/python3.8/dist-packages/pmdarima/arima/auto.py in fit(self, y, X, **fit_args)
    165         sarimaxkwargs = {} if not self.kwargs else self.kwargs
    166
--> 167         self.model_ = auto_arima(
    168             y,
    169             X=X,

/usr/local/lib/python3.8/dist-packages/pmdarima/arima/auto.py in auto_arima(y, X, start_p, d, start_q, max_p, max_d, max_q, start_P, D, start_Q, max_P, max_D, max_Q, max_order, m, seasonal, stationary, information_criterion, alpha, test, seasonal_test, stepwise, n_jobs, start_params, trend, method, maxiter, offset_test_args, seasonal_test_args, suppress_warnings, error_action, trace, random, random_state, n_fits, return_valid_fits, out_of_sample_size, scoring, scoring_args, with_intercept, sarimax_kwargs, **fit_args)
    699     )
    700
--> 701     sorted_res = search.solve()
    702     return _return_wrapper(sorted_res, return_valid_fits, start, trace)
    703

/usr/local/lib/python3.8/dist-packages/pmdarima/arima/_auto_solvers.py in solve(self)
    458         )
    459
--> 460         sorted_fits = _sort_and_filter_fits(filtered_models_ics)
    461         if self.trace and sorted_fits:
    462             print(f"\nBest model: {str(sorted_fits[0])}")

/usr/local/lib/python3.8/dist-packages/pmdarima/arima/_auto_solvers.py in _sort_and_filter_fits(models)
    566     # if the list is empty, or if it was an ARIMA and it's None
    567     if not filtered:
--> 568         raise ValueError(
    569             "Could not successfully fit a viable ARIMA model "
    570             "to input data.\nSee "

ValueError: Could not successfully fit a viable ARIMA model to input data. See http://alkaline-ml.com/pmdarima/no-successful-model.html for more information on why this can happen.

Additional Context

No response

aaronreidsmith commented 1 year ago

I am unable to reproduce either locally or in Google Colab (link). Does your example input work locally? I get this in both environments:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/pipeline.py", line 222, in fit
    self._final_estimator.fit(yt, X=Xt, **kwargs)
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/arima/auto.py", line 167, in fit
    self.model_ = auto_arima(
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/arima/auto.py", line 506, in auto_arima
    D = nsdiffs(xx, m=m, test=seasonal_test, max_D=max_D,
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/arima/utils.py", line 106, in nsdiffs
    dodiff = testfunc(x)
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/arima/seasonality.py", line 597, in estimate_seasonal_differencing_term
    stat = self._compute_test_statistic(x)
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/arima/seasonality.py", line 537, in _compute_test_statistic
    fit = self._fit_ocsb(x, m, lag_term, maxlag)
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pmdarima/arima/seasonality.py", line 493, in _fit_ocsb
    ar_fit = sm.OLS(y, add_constant(mf)).fit(method='qr')
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/statsmodels/tools/tools.py", line 270, in add_constant
    is_nonzero_const = np.ptp(x, axis=0) == 0
  File "<__array_function__ internals>", line 180, in ptp
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2669, in ptp
    return _methods._ptp(a, axis=axis, out=out, **kwargs)
  File "/Users/asmith/.pyenv/versions/3.10.6/lib/python3.10/site-packages/numpy/core/_methods.py", line 279, in _ptp
    umr_maximum(a, axis, None, out, keepdims),
ValueError: zero-size array to reduction operation maximum which has no identity
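
The traceback above ends inside the seasonal differencing test (`_fit_ocsb`), where NumPy is handed an empty array. A minimal sketch of why 10 points with m=7 can run out of data, assuming only plain lag differencing (this is not pmdarima's OCSB code, and `lag_diff` is a hypothetical helper):

```python
import numpy as np


def lag_diff(x, lag):
    """Hypothetical helper: difference a 1-D series at `lag`; empty if too short."""
    x = np.asarray(x, dtype=float)
    if lag >= x.size:
        return np.empty(0)
    return x[lag:] - x[:-lag]


# The 10 sample values from the report.
y = [81.99130735, 82.02854792, 82.02854792, 79.41550171, 80.50789157,
     64.73651292, 70.85637886, 75.05835578, 75.89006169, 72.05428365]
m = 7

seasonal = lag_diff(y, m)      # seasonal difference at lag 7
both = lag_diff(seasonal, 1)   # followed by a first difference
print(len(y), seasonal.size, both.size)  # 10 3 2

# Reducing over an empty axis raises the same kind of error seen in np.ptp above.
try:
    np.ptp(np.empty((0, 2)), axis=0)
    message = ""
except ValueError as exc:
    message = str(exc)
print(message)
```

Only two observations survive both differences, so any regression that also needs lagged terms has nothing left to fit. Posting a larger slice of the series (several multiples of m) would likely make the example a fairer reproduction of the original problem.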