ValueError when predicting

pmoriano commented 2 years ago

Describe the bug

Hello, I am trying to predict a few values from a trained model, but I get the error below. The code to reproduce it follows the traceback.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_234836/3650015805.py in <module>
      5 
      6 arima_model = pm.auto_arima(y=sample, error_action="ignore", supress_warnings=True)
----> 7 predictions = arima_model.predict(3)

~/.conda/envs/basic38/lib/python3.8/site-packages/pmdarima/arima/arima.py in predict(self, n_periods, X, return_conf_int, alpha, **kwargs)
    674         end = arima.nobs + n_periods - 1
    675 
--> 676         f, conf_int = _seasonal_prediction_with_confidence(
    677             arima_res=arima,
    678             start=arima.nobs,

~/.conda/envs/basic38/lib/python3.8/site-packages/pmdarima/arima/arima.py in _seasonal_prediction_with_confidence(arima_res, start, end, X, alpha, **kwargs)
     86     conf_int = results.conf_int(alpha=alpha)
     87     return check_endog(f, dtype=None, copy=False), \
---> 88         check_array(conf_int, copy=False, dtype=None)
     89 
     90 

~/.conda/envs/basic38/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

~/.conda/envs/basic38/lib/python3.8/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    718 
    719         if force_all_finite:
--> 720             _assert_all_finite(array,
    721                                allow_nan=force_all_finite == 'allow-nan')
    722 

~/.conda/envs/basic38/lib/python3.8/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
    101                 not allow_nan and not np.isfinite(X).all()):
    102             type_err = 'infinity' if allow_nan else 'NaN, infinity'
--> 103             raise ValueError(
    104                     msg_err.format
    105                     (type_err,

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

To Reproduce

import numpy as np
import pmdarima as pm

sample = np.array([63.75, 63.875, 64.44444444, 65., 64., 63., 62., 59.88888889, 53.33333333, 51.25, 48.875, 37.25])
print(np.isnan(sample)) # To see if there are NaNs. Do not see any. 

arima_model = pm.auto_arima(y=sample, error_action="ignore", supress_warnings=True)
predictions = arima_model.predict(3)
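
The reproduction above checks the input only with np.isnan, while the error message also covers infinity. A minimal sketch of a broader check follows. The helper name `non_finite_positions` is hypothetical, and the sketch uses only the standard library so it runs without any extra packages; on a NumPy array, `np.isfinite(sample).all()` would give the same yes/no answer. Note that the traceback shows the check failing on the confidence-interval array produced by the fitted model, so a clean input does not by itself rule out this error.

```python
import math


def non_finite_positions(values):
    """Return (index, value) pairs for entries that are NaN or +/- infinity."""
    return [(i, v) for i, v in enumerate(values) if not math.isfinite(v)]


# Same numbers as the reproduction, written as a plain list.
sample = [63.75, 63.875, 64.44444444, 65.0, 64.0, 63.0, 62.0,
          59.88888889, 53.33333333, 51.25, 48.875, 37.25]

# An empty list means every input value is finite.
print(non_finite_positions(sample))
```

If this prints an empty list, the non-finite values are most likely introduced during fitting or forecasting rather than by the data itself.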

Versions

python=3.8.12
scikit-learn=0.24.2
statsmodels=0.13.0
pmdarima=1.8.2

Expected Behavior

An array with three predictions.

Actual Behavior

The error described above.

Additional Context

No response

aaronreidsmith commented 2 years ago

Can you post the full output of pm.show_versions()? I am unable to reproduce with the versions you provided

$ docker run --rm -it continuumio/miniconda3:4.10.3 /bin/bash
(base) root@eb34ce2b008d:/# conda create --name debug python=3.8.12
...
(base) root@eb34ce2b008d:/# conda activate debug
(debug) root@eb34ce2b008d:/# conda config --add channels conda-forge
(debug) root@eb34ce2b008d:/# conda config --set channel_priority strict
(debug) root@eb34ce2b008d:/# conda install pmdarima
...
(debug) root@eb34ce2b008d:/# conda install -c conda-forge scikit-learn=0.24.2 # Fix sklearn issue
(debug) root@eb34ce2b008d:/# python
Python 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:59:51) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import pmdarima as pm
>>>
>>> sample = np.array([63.75, 63.875, 64.44444444, 65., 64., 63., 62., 59.88888889, 53.33333333, 51.25, 48.875, 37.25])
>>> print(np.isnan(sample)) # To see if there are NaNs. Do not see any. 
[False False False False False False False False False False False False]
>>>
>>> arima_model = pm.auto_arima(y=sample, error_action="ignore", supress_warnings=True)
>>> predictions = arima_model.predict(3)
>>> predictions
array([30.29305912, 26.95415383, 26.27618976])
>>> pm.show_versions()

System:
    python: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:59:51)  [GCC 9.4.0]
executable: /opt/conda/envs/debug/bin/python
   machine: Linux-5.10.47-linuxkit-x86_64-with-glibc2.10

Python dependencies:
        pip: 21.3
 setuptools: 49.6.0.post20210108
    sklearn: 0.24.2
statsmodels: 0.13.0
      numpy: 1.19.5
      scipy: 1.7.1
     Cython: 0.29.24
     pandas: 1.3.3
     joblib: 1.1.0
   pmdarima: 1.8.2

pmoriano commented 2 years ago

@aaronreidsmith Thank you for your response. Below is the output of pm.show_versions().

System:
    python: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:59:51)  [GCC 9.4.0]
executable: /home/8a6/.conda/envs/basic38/bin/python
   machine: Linux-3.10.0-1160.41.1.el7.x86_64-x86_64-with-glibc2.10

Python dependencies:
        pip: 21.3
 setuptools: 49.6.0.post20210108
    sklearn: 0.24.2
statsmodels: 0.13.0
      numpy: 1.19.5
      scipy: 1.7.1
     Cython: 0.29.24
     pandas: 1.3.3
     joblib: 1.1.0
   pmdarima: 1.8.2

pmoriano commented 2 years ago

@aaronreidsmith Now it works. I deleted my old environment and created a new conda environment from scratch. pm.show_versions() shows the same output as above, so I am not sure why the old environment failed. Thanks for the help.
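
Since pm.show_versions() prints only a fixed list of packages, two environments can report identical output and still differ elsewhere. One way to compare more packages is sketched below, assuming Python 3.8+ (for `importlib.metadata`). The helper names `installed_versions` and `diff_versions` are hypothetical, and version strings alone would not reveal differences in how a package was compiled.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(a, b):
    """Return {name: (version_in_a, version_in_b)} for names whose versions differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Illustrative values only; in practice each dict would come from
# installed_versions(...) run inside a different environment.
old_env = {"numpy": "1.19.5", "pmdarima": "1.8.2"}
new_env = {"numpy": "1.21.3", "pmdarima": "1.8.2"}
print(diff_versions(old_env, new_env))
```

Running `installed_versions` in both the failing and the working environment and diffing the results would narrow down which package changed, if any.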

aaronreidsmith commented 2 years ago

Glad you were able to get it figured out!

tgsmith61591 commented 2 years ago

I've still been unable to replicate this. Both examples provided (in this issue and in #464) produce predictions:

# this issue
>>> arima_model.predict(3)
# array([30.29305912, 26.95415383, 26.27618976])

# issue 464
>>> arima_model.predict(3)
# array([59.50482616, 59.74060519, 59.95876076])

Can you provide any other context or system info?

tgsmith61591 commented 2 years ago

I created a fresh conda env:

$ conda create python=3.8 -n pm-tmp

And was able to run the sample successfully:

In [1]: import numpy as np
   ...: import pmdarima as pm
   ...:
   ...: sample = np.array([63.75, 63.875, 64.44444444, 65., 64., 63., 62., 59.88888889, 53.33333333, 51.25, 48.875, 37.25])
   ...: print(np.isnan(sample)) # To see if there are NaNs. Do not see any.
   ...:
   ...: arima_model = pm.auto_arima(y=sample, error_action="ignore", supress_warnings=True)
   ...: predictions = arima_model.predict(3)

[False False False False False False False False False False False False]

In [2]: predictions
Out[2]: array([30.29305912, 26.95415383, 26.27618976])

Could you please install this exact env and try the example again? Trying to determine if this is an OS level issue.

1. Copy the YAML below into environment.yml
2. Install the env: conda env create -f environment.yml

name: pm-tmp
channels:
  - defaults
dependencies:
  - appnope=0.1.2=py38hecd8cb5_1001
  - backcall=0.2.0=pyhd3eb1b0_0
  - ca-certificates=2021.10.26=hecd8cb5_2
  - certifi=2021.10.8=py38hecd8cb5_0
  - decorator=5.1.0=pyhd3eb1b0_0
  - ipython=7.27.0=py38h01d92e1_0
  - jedi=0.18.0=py38hecd8cb5_1
  - libcxx=12.0.0=h2f01273_0
  - libffi=3.3=hb1e8313_2
  - matplotlib-inline=0.1.2=pyhd3eb1b0_2
  - ncurses=6.2=h0a44026_1
  - openssl=1.1.1l=h9ed2024_0
  - parso=0.8.2=pyhd3eb1b0_0
  - pexpect=4.8.0=pyhd3eb1b0_3
  - pickleshare=0.7.5=pyhd3eb1b0_1003
  - pip=21.2.4=py38hecd8cb5_0
  - prompt-toolkit=3.0.20=pyhd3eb1b0_0
  - ptyprocess=0.7.0=pyhd3eb1b0_2
  - pygments=2.10.0=pyhd3eb1b0_0
  - python=3.8.12=h88f2d9e_0
  - readline=8.1=h9ed2024_0
  - setuptools=58.0.4=py38hecd8cb5_0
  - sqlite=3.36.0=hce871da_0
  - tk=8.6.11=h7bc2e8c_0
  - traitlets=5.1.0=pyhd3eb1b0_0
  - wcwidth=0.2.5=pyhd3eb1b0_0
  - wheel=0.37.0=pyhd3eb1b0_1
  - xz=5.2.5=h1de35cc_0
  - zlib=1.2.11=h1de35cc_3
  - pip:
    - cython==0.29.24
    - joblib==1.1.0
    - numpy==1.21.3
    - pandas==1.3.4
    - patsy==0.5.2
    - pmdarima==1.8.3
    - python-dateutil==2.8.2
    - pytz==2021.3
    - scikit-learn==1.0.1
    - scipy==1.7.1
    - six==1.16.0
    - statsmodels==0.13.0
    - threadpoolctl==3.0.0
    - urllib3==1.26.7
prefix: /opt/miniconda3/envs/pm-tmp

tgsmith61591 commented 2 years ago

@pmoriano were you able to try this with the above env? ^

pmoriano commented 2 years ago

@tgsmith61591. Sorry for the late reply. Please look at #464 to see the data for which this is not working. I am putting that data here again. Thanks for the help.

import numpy as np
import pmdarima as pm

sample = np.array([65.375, 65.75, 66.11111111, 65.375, 66., 66.22222222, 66., 63.44444444, 62.375, 63.125, 60., 59.25])

arima_model = pm.auto_arima(y=sample, error_action="ignore", supress_warnings=True)
predictions = arima_model.predict(3)
print(predictions)

tgsmith61591 commented 2 years ago

Yep, as mentioned here: https://github.com/alkaline-ml/pmdarima/issues/462#issuecomment-953764293, I was not able to replicate your error with that dataset either. Can you please try the environment provided?

import numpy as np
import pmdarima as pm

sample = np.array([65.375, 65.75, 66.11111111, 65.375, 66., 66.22222222, 66., 63.44444444, 62.375, 63.125, 60., 59.25])

arima_model = pm.auto_arima(y=sample, error_action="ignore", supress_warnings=True)
predictions = arima_model.predict(3)

Out[3]: array([59.50482616, 59.74060519, 59.95876076])

pmoriano commented 2 years ago

@tgsmith61591 Thanks for your reply. I tried creating the environment you suggested, but conda failed with the following. Any idea?

Collecting package metadata: done
Solving environment: failed

ResolvePackageNotFound: 
  - certifi==2021.10.8=py38hecd8cb5_0
  - appnope==0.1.2=py38hecd8cb5_1001
  - tk==8.6.11=h7bc2e8c_0
  - ipython==7.27.0=py38h01d92e1_0
  - setuptools==58.0.4=py38hecd8cb5_0
  - zlib==1.2.11=h1de35cc_3
  - pip==21.2.4=py38hecd8cb5_0
  - readline==8.1=h9ed2024_0
  - jedi==0.18.0=py38hecd8cb5_1
  - sqlite==3.36.0=hce871da_0
  - libcxx==12.0.0=h2f01273_0
  - ncurses==6.2=h0a44026_1
  - python==3.8.12=h88f2d9e_0
  - ca-certificates==2021.10.26=hecd8cb5_2
  - xz==5.2.5=h1de35cc_0
  - libffi==3.3=hb1e8313_2
  - openssl==1.1.1l=h9ed2024_0

This is my OS info:

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
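
Every package in the ResolvePackageNotFound list is pinned to an exact conda build string, and the environment file includes packages such as appnope that are typically macOS-only, so it was likely exported on a different platform than CentOS. A minimal sketch of one workaround is to drop the build strings (and any platform-specific packages) before creating the environment. The helper names `strip_build` and `portable_conda_deps` are hypothetical, and which packages to drop is an assumption the user would need to confirm.

```python
def strip_build(spec):
    """Turn 'name=version=build' into 'name=version'; shorter specs pass through."""
    return "=".join(spec.split("=")[:2])


def portable_conda_deps(specs, drop=()):
    """Remove build strings and skip any package whose name is listed in `drop`."""
    portable = []
    for spec in specs:
        name = spec.split("=")[0]
        if name in drop:
            continue
        portable.append(strip_build(spec))
    return portable


# A few entries copied from the environment file in this thread.
pinned = [
    "appnope=0.1.2=py38hecd8cb5_1001",
    "python=3.8.12=h88f2d9e_0",
    "zlib=1.2.11=h1de35cc_3",
]

# Dropping appnope is an assumption: it is normally needed only on macOS.
print(portable_conda_deps(pinned, drop={"appnope"}))
```

On the exporting machine, `conda env export --no-builds` produces a file without build strings directly. The pip section of the file is not affected by conda build strings and can be kept as-is.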
