alkaline-ml / pmdarima

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
https://www.alkaline-ml.com/pmdarima
MIT License

LinAlgError: SVD did not converge when running tests with pytest #1

Closed charlesdrotar closed 7 years ago

charlesdrotar commented 7 years ago

This error occurs when running py.test, but not when running nosetests. The current testing framework is nosetests, but it would be best to switch, since nose is no longer maintained and its maintainers suggest moving to pytest.

Below is the stack trace:

==================================================================================== FAILURES ====================================================================================
_________________________________________________________________________________ test_the_r_src _________________________________________________________________________________

    def test_the_r_src():
        # this is the test the R code provides
        fit = ARIMA(order=(2, 0, 1), trend='c', suppress_warnings=True).fit(abc)

        # the R code's AIC = ~135
        assert abs(135 - fit.aic()) < 1.0

        # the R code's BIC = ~145
        assert abs(145 - fit.bic()) < 1.0

        # R's coefficients:
        #     ar1      ar2     ma1    mean
        # -0.6515  -0.2449  0.8012  5.0370

        # note that statsmodels' mean is on the front, not the end.
        params = fit.params()
        assert_almost_equal(params, np.array([5.0370, -0.6515, -0.2449, 0.8012]), decimal=2)

        # > fit = forecast::auto.arima(abc, max.p=5, max.d=5, max.q=5, max.order=100, stepwise=F)
        fit = auto_arima(abc, max_p=5, max_d=5, max_q=5, max_order=100, seasonal=False,
>                        trend='c', suppress_warnings=True, error_action='ignore')

pyramid/arima/tests/test_arima.py:194:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyramid/arima/auto.py:463: in auto_arima
    for order, seasonal_order in gen)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py:758: in __call__
    while self.dispatch_one_batch(iterator):
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py:608: in dispatch_one_batch
    self._dispatch(tasks)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py:571: in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/sklearn/externals/joblib/_parallel_backends.py:109: in apply_async
    result = ImmediateResult(func)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/sklearn/externals/joblib/_parallel_backends.py:326: in __init__
    self.results = batch()
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py:131: in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
pyramid/arima/auto.py:489: in _fit_arima
    .fit(x, exogenous=xreg, **fit_params)
pyramid/arima/arima.py:252: in fit
    _, self.arima_res_ = _fit_wrapper()
pyramid/arima/arima.py:246: in _fit_wrapper
    **fit_args)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/tsa/arima_model.py:969: in fit
    callback=callback, **kwargs)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/base/model.py:451: in fit
    full_output=full_output)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/base/optimizer.py:184: in _fit
    hess=hessian)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/base/optimizer.py:378: in _fit_lbfgs
    **extra_kwargs)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py:193: in fmin_l_bfgs_b
    **opts)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py:328: in _minimize_lbfgsb
    f, g = func_and_grad(x)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/scipy/optimize/lbfgsb.py:273: in func_and_grad
    f = fun(x, *args)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/scipy/optimize/optimize.py:292: in function_wrapper
    return function(*(wrapper_args + args))
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/base/model.py:429: in <lambda>
    f = lambda params, *args: -self.loglike(params, *args) / nobs
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/tsa/arima_model.py:790: in loglike
    return self.loglike_kalman(params, set_sigma2)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/tsa/arima_model.py:800: in loglike_kalman
    return KalmanFilter.loglike(params, self, set_sigma2)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/statsmodels/tsa/kalmanf/kalmanfilter.py:649: in loglike
    R_mat, T_mat)
statsmodels/tsa/kalmanf/kalman_loglike.pyx:342: in statsmodels.tsa.kalmanf.kalman_loglike.kalman_loglike_double (statsmodels/tsa/kalmanf/kalman_loglike.c:6510)
    ???
statsmodels/tsa/kalmanf/kalman_loglike.pyx:74: in statsmodels.tsa.kalmanf.kalman_loglike.kalman_filter_double (statsmodels/tsa/kalmanf/kalman_loglike.c:3560)
    ???
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/numpy/linalg/linalg.py:1647: in pinv
    u, s, vt = svd(a, 0)
../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/numpy/linalg/linalg.py:1389: in svd
    u, s, vt = gufunc(a, signature=signature, extobj=extobj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

err = 'invalid value', flag = 8

    def _raise_linalgerror_svd_nonconvergence(err, flag):
>       raise LinAlgError("SVD did not converge")
E       LinAlgError: SVD did not converge

../../anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/numpy/linalg/linalg.py:99: LinAlgError
================================================================================ warnings summary ================================================================================
pyramid/arima/tests/test_arima.py::test_the_r_src
  /Users/charlesdrotar/anaconda2/envs/pyramid_2-7/lib/python2.7/site-packages/scipy/linalg/basic.py:1018: RuntimeWarning: internal gelsd driver lwork query error, required iwork dimension not returned. This is likely the result of LAPACK bug 0038, fixed in LAPACK 3.2.2 (released July 21, 2010). Falling back to 'gelss' driver.
    warnings.warn(mesg, RuntimeWarning)

-- Docs: http://doc.pytest.org/en/latest/warnings.html
================================================================ 1 failed, 33 passed, 1 warnings in 26.81 seconds ================================================================

My environment (Python 2.7):

cov-core==1.15.0
coverage==4.4.1
Cython==0.25.2
nose==1.3.7
nose-cov==1.6
nose-timer==0.7.0
numpy==1.13.0
ordereddict==1.1
pandas==0.20.2
patsy==0.4.1
py==1.4.34
-e git+https://github.com/tgsmith61591/pyramid.git@8bcd98729d46eafd112c628329a277703273cdb8#egg=pyramid
pytest==3.1.1
pytest-cov==2.5.1
python-dateutil==2.6.0
pytz==2017.2
PyYAML==3.12
scikit-learn==0.18.1
scipy==0.19.0
selenium==3.4.3
six==1.10.0
statsmodels==0.8.0
termcolor==1.1.0
xmltodict==0.11.0

tgsmith61591 commented 7 years ago

And nosetests works? Is py.test being run from inside your env or at the /anaconda/bin (or lib) level?

charlesdrotar commented 7 years ago

They are both run from the same activated conda env.

tgsmith61591 commented 7 years ago

(pardon the deleted comments—consolidating all here)

I cannot replicate. Calling pytest works for me. There are only a few differences in our envs:

numpy=1.12.1
scikit-learn=0.19.dev0
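
When comparing two environments like this, it can help to print the versions the interpreter actually imports rather than relying on a pip listing. The following is a minimal sketch, not part of pyramid; the function name `installed_versions` and the default module list are assumptions chosen for illustration.

```python
import importlib


def installed_versions(names=("numpy", "scipy", "statsmodels", "sklearn", "pytest")):
    """Return {module name: version string, or None if the module cannot be imported}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is not installed in the environment running this script.
            versions[name] = None
            continue
        # Most scientific packages expose __version__; fall back if one does not.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions().items()):
        print("{}=={}".format(name, version))
```

Running the script in each environment and diffing the output shows which packages really differ at import time.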

When calling pytest in an env, it's important you're actually calling it from the env... for instance, consider:

$ which pytest
//anaconda/bin/pytest
$ source activate py35
(py35) $ which pytest
//anaconda/bin/pytest

Notice it's pointing to the same pytest in both envs (not always the case, but if you have not explicitly conda installed pytest in the env itself, it uses the root version). You should use:

//anaconda/envs/{YOUR_ENV}/bin/pytest
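
The same check can be done from inside Python. This is a hedged sketch assuming Python 3.3+ (for `shutil.which`) and a POSIX-style environment layout where `python` and `pytest` live in the same `bin` directory; the helper name `pytest_matches_interpreter` is made up for this example.

```python
import os
import shutil
import sys


def pytest_matches_interpreter():
    """Return (path of `pytest` on PATH or None, True if it sits beside sys.executable)."""
    pytest_path = shutil.which("pytest")
    if pytest_path is None:
        # No pytest executable is visible on PATH at all.
        return None, False
    interpreter_dir = os.path.dirname(os.path.abspath(sys.executable))
    pytest_dir = os.path.dirname(os.path.abspath(pytest_path))
    return pytest_path, pytest_dir == interpreter_dir


if __name__ == "__main__":
    path, same_env = pytest_matches_interpreter()
    print("interpreter:", sys.executable)
    print("pytest on PATH:", path)
    print("same directory:", same_env)
```

Another way to sidestep the question is to run `python -m pytest` with the env's interpreter, which uses the pytest installed for that interpreter.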

charlesdrotar commented 7 years ago

It appears that it is using the right pytest:

(pyramid_2-7) Charless-MBP:pyramid charlesdrotar$ which pytest
/Users/charlesdrotar/anaconda2/envs/pyramid_2-7/bin/pytest

tgsmith61591 commented 7 years ago

I'm going to close this. See this Stack Overflow question (without a good answer). Basically, if there is more than one singular value that is equal to zero, then SVD will not converge. However, this is no fault of pyramid—rather, that of numpy or scipy—and the behavior seems to change depending on the scipy and numpy versions.
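
For anyone trying to narrow this down on a specific numpy build, a small baseline check may help. The thread does not establish which matrix reached `numpy.linalg.pinv`, so the sketch below makes two assumptions: that non-finite entries (NaN or inf) are worth ruling out first, and that a tiny rank-deficient matrix is a reasonable smoke test. The helper `checked_pinv` is hypothetical and is not part of pyramid or statsmodels.

```python
import numpy as np


def checked_pinv(matrix):
    """Pseudo-inverse that refuses non-finite input instead of handing it to SVD."""
    a = np.asarray(matrix, dtype=float)
    if not np.isfinite(a).all():
        raise ValueError("matrix contains NaN or inf; pinv was not attempted")
    return np.linalg.pinv(a)


# A finite matrix with two zero singular values, used as a baseline.
baseline = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])

if __name__ == "__main__":
    print("numpy", np.__version__)
    print("singular values:", np.linalg.svd(baseline, compute_uv=False))
    print("pinv:\n", checked_pinv(baseline))
```

If the baseline runs cleanly in the failing environment, the next thing to inspect would be the actual matrix passed to `pinv` during the failing fit.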

If it can be shown that this is a pyramid issue and not a result of the numpy/scipy versions, I will re-open. Furthermore, if any specific version poses a problem (looking at numpy 1.13, since that's the only place our versions really differ), I can set up a wheel to upgrade/downgrade numpy in the setup.py as necessary.