Nixtla / statsforecast

Lightning ⚡️ fast forecasting with statistical and econometric models.
https://nixtlaverse.nixtla.io/statsforecast
Apache License 2.0

[AutoCES] "You must specify `level` when using `prediction_intervals`" #624

Closed RashidBakirov closed 1 year ago

RashidBakirov commented 1 year ago

What happened + What you expected to happen

I get the error "You must specify `level` when using `prediction_intervals`" while calling forecast(h=12) with forecast model AutoCES.

This is a monthly forecast with >200 unique_ids. There are no gaps or 0's, and other automatic forecasting methods work without issues. I have tried to isolate some of the id's replacing y's with random values to be able to share the reproduction script, however this did not reproduce the error.

Is there anything specific in the data that I need to ensure to make AutoCES work? Or any other ideas?

Versions / Dependencies

Environment ``` python version 3.10.6 absl-py==1.0.0 accelerate==0.18.0 adagio==0.2.4 aiobotocore==2.5.4 aiohttp==3.8.4 aioitertools==0.11.0 aiosignal==1.3.1 antlr4-python3-runtime==4.11.1 appdirs==1.4.4 argon2-cffi==21.3.0 argon2-cffi-bindings==21.2.0 astor==0.8.1 asttokens==2.2.1 astunparse==1.6.3 async-timeout==4.0.2 attrs==21.4.0 audioread==3.0.0 azure-core==1.26.4 azure-cosmos==4.3.1b1 azure-storage-blob==12.16.0 azure-storage-file-datalake==12.11.0 backcall==0.2.0 bcrypt==3.2.0 beautifulsoup4==4.11.1 black==22.6.0 bleach==4.1.0 blinker==1.4 blis==0.7.9 boto3==1.24.28 botocore==1.31.17 cachetools==4.2.4 catalogue==2.0.8 category-encoders==2.6.0 certifi==2022.9.14 cffi==1.15.1 chardet==4.0.0 charset-normalizer==2.0.4 click==8.0.4 cloudpickle==2.0.0 cmdstanpy==1.1.0 confection==0.0.4 configparser==5.2.0 convertdate==2.4.0 cryptography==37.0.1 cycler==0.11.0 cymem==2.0.7 Cython==0.29.32 databricks-automl-runtime==0.2.16 databricks-cli==0.17.6 databricks-feature-store==0.12.1 dataclasses-json==0.5.7 datasets==2.12.0 datasetsforecast==0.0.8 dbl-tempo==0.1.23 dbus-python==1.2.18 debugpy==1.5.1 decorator==5.1.1 defusedxml==0.7.1 dill==0.3.4 diskcache==5.6.1 distlib==0.3.6 distro==1.7.0 distro-info===1.1build1 docstring-to-markdown==0.12 entrypoints==0.4 ephem==4.1.4 evaluate==0.4.0 executing==1.2.0 facets-overview==1.0.3 fastjsonschema==2.16.3 fasttext==0.9.2 filelock==3.6.0 Flask @ https://databricks-build-artifacts-manual-staging.s3.amazonaws.com/flask/Flask-1.1.2%2Bdb1-py2.py3-none-any.whl?AWSAccessKeyId=AKIAX7HWM34HCSVHYQ7M&Expires=2001354391&Signature=bztIumr2jXFbisF0QicZvqbvT9s%3D flatbuffers==23.5.9 fonttools==4.25.0 frozenlist==1.3.3 fs==2.4.16 fsspec==2023.6.0 fugue==0.8.6 fugue-sql-antlr==0.1.6 future==0.18.2 gast==0.4.0 gitdb==4.0.10 GitPython==3.1.27 google-api-core==2.8.2 google-auth==1.33.0 google-auth-oauthlib==0.4.6 google-cloud-core==2.3.2 google-cloud-storage==2.9.0 google-crc32c==1.5.0 google-pasta==0.2.0 google-resumable-media==2.5.0 
googleapis-common-protos==1.56.4 greenlet==1.1.1 grpcio==1.48.1 grpcio-status==1.48.1 gunicorn==20.1.0 gviz-api==1.10.0 h5py==3.7.0 hijri-converter==2.3.1 holidays==0.22 horovod==0.27.0 htmlmin==0.1.12 httplib2==0.20.2 huggingface-hub==0.14.1 idna==3.3 ImageHash==4.3.1 imbalanced-learn==0.8.1 importlib-metadata==4.11.3 ipykernel==6.17.1 ipython==8.10.0 ipython-genutils==0.2.0 ipywidgets==7.7.2 isodate==0.6.1 itsdangerous==2.0.1 jedi==0.18.1 jeepney==0.7.1 Jinja2==2.11.3 jmespath==0.10.0 joblib==1.2.0 joblibspark==0.5.1 jsonschema==4.16.0 jupyter-client==7.3.4 jupyter_core==4.11.2 jupyterlab-pygments==0.1.2 jupyterlab-widgets==1.0.0 keras==2.11.0 keyring==23.5.0 kiwisolver==1.4.2 korean-lunar-calendar==0.3.1 langchain==0.0.152 langcodes==3.3.0 launchpadlib==1.10.16 lazr.restfulclient==0.14.4 lazr.uri==1.0.6 lazy_loader==0.2 libclang==15.0.6.1 librosa==0.10.0 lightgbm==3.3.5 llvmlite==0.38.0 LunarCalendar==0.0.9 Mako==1.2.0 Markdown==3.3.4 MarkupSafe==2.0.1 marshmallow==3.19.0 marshmallow-enum==1.5.1 matplotlib==3.5.2 matplotlib-inline==0.1.6 mccabe==0.7.0 mistune==0.8.4 mleap==0.20.0 mlflow-skinny==2.3.1 mlforecast==0.9.1 more-itertools==8.10.0 msgpack==1.0.5 multidict==6.0.4 multimethod==1.9.1 multiprocess==0.70.12.2 murmurhash==1.0.9 mypy-extensions==0.4.3 nbclient==0.5.13 nbconvert==6.4.4 nbformat==5.5.0 nest-asyncio==1.5.5 networkx==2.8.4 nltk==3.7 nodeenv==1.8.0 notebook==6.4.12 numba==0.55.1 numexpr==2.8.4 numpy==1.21.6 oauthlib==3.2.0 openai==0.27.4 openapi-schema-pydantic==1.2.4 opt-einsum==3.3.0 packaging==21.3 pandas==1.4.4 pandocfilters==1.5.0 paramiko==2.9.2 parso==0.8.3 pathspec==0.9.0 pathy==0.10.1 patsy==0.5.2 petastorm==0.12.1 pexpect==4.8.0 phik==0.12.3 pickleshare==0.7.5 Pillow==9.2.0 platformdirs==2.5.2 plotly==5.9.0 pluggy==1.0.0 pmdarima==2.0.3 polars==0.18.15 pooch==1.7.0 preshed==3.0.8 prometheus-client==0.14.1 prompt-toolkit==3.0.36 prophet==1.1.2 protobuf==3.19.4 psutil==5.9.0 psycopg2==2.9.3 ptyprocess==0.7.0 pure-eval==0.2.2 pyarrow==8.0.0 
pyasn1==0.4.8 pyasn1-modules==0.2.8 pybind11==2.10.4 pycparser==2.21 pydantic==1.10.6 pyflakes==3.0.1 Pygments==2.11.2 PyGObject==3.42.1 PyJWT==2.3.0 PyMeeus==0.5.12 PyNaCl==1.5.0 pyodbc==4.0.32 pyparsing==3.0.9 pyright==1.1.294 pyrsistent==0.18.0 pytesseract==0.3.10 python-apt==2.4.0+ubuntu1 python-dateutil==2.8.2 python-editor==1.0.4 python-lsp-jsonrpc==1.0.0 python-lsp-server==1.7.1 pytoolconfig==1.2.2 pytz==2022.1 PyWavelets==1.3.0 PyYAML==6.0 pyzmq==23.2.0 qpd==0.4.4 regex==2022.7.9 requests==2.28.1 requests-oauthlib==1.3.1 responses==0.18.0 rope==1.7.0 rsa==4.9 s3fs==2023.6.0 s3transfer==0.6.0 scikit-learn==1.1.1 scipy==1.9.1 seaborn==0.11.2 SecretStorage==3.3.1 Send2Trash==1.8.0 sentence-transformers==2.2.2 sentencepiece==0.1.97 shap==0.41.0 simplejson==3.17.6 six==1.16.0 slicer==0.0.7 smart-open==5.2.1 smmap==5.0.0 soundfile==0.12.1 soupsieve==2.3.1 soxr==0.3.5 spacy==3.5.1 spacy-legacy==3.0.12 spacy-loggers==1.0.4 spark-tensorflow-distributor==1.0.0 SQLAlchemy==1.4.39 sqlglot==17.15.1 sqlparse==0.4.2 srsly==2.4.6 ssh-import-id==5.11 stack-data==0.6.2 statsforecast==1.6.0 statsmodels==0.13.2 tabulate==0.8.10 tangled-up-in-unicode==0.2.0 tenacity==8.1.0 tensorboard==2.11.0 tensorboard-data-server==0.6.1 tensorboard-plugin-profile==2.11.2 tensorboard-plugin-wit==1.8.1 tensorflow-cpu==2.11.1 tensorflow-estimator==2.11.0 tensorflow-io-gcs-filesystem==0.32.0 termcolor==2.3.0 terminado==0.13.1 testpath==0.6.0 thinc==8.1.10 threadpoolctl==2.2.0 tiktoken==0.3.3 tokenize-rt==4.2.1 tokenizers==0.13.3 tomli==2.0.1 torch==1.13.1+cpu torchvision==0.14.1+cpu tornado==6.1 tqdm==4.64.1 traitlets==5.1.1 transformers==4.28.1 triad==0.9.1 typeguard==2.13.3 typer==0.7.0 typing-inspect==0.8.0 typing_extensions==4.3.0 ujson==5.4.0 unattended-upgrades==0.1 urllib3==1.26.11 virtualenv==20.16.3 visions==0.7.5 wadllib==1.3.6 wasabi==1.1.1 wcwidth==0.2.5 webencodings==0.5.1 websocket-client==0.58.0 Werkzeug==2.0.3 whatthepatch==1.0.2 widgetsnbextension==3.6.1 window-ops==0.0.14 
wrapt==1.14.1 xgboost==1.7.5 xlrd==2.0.1 xxhash==3.2.0 yapf==0.31.0 yarl==1.9.2 ydata-profiling==4.1.2 zipp==3.8.0 ```

Reproduction script

-

Issue Severity

Medium: It is a significant difficulty but I can work around it.

jmoralez commented 1 year ago

Hey @RashidBakirov, thanks for using statsforecast. Can you provide the command you're running? It seems like you're missing the level argument in the predict/forecast method.

RashidBakirov commented 1 year ago

> Hey @RashidBakirov, thanks for using statsforecast. Can you provide the command you're running? It seems like you're missing the level argument in the predict/forecast method.

Hi, as far as I understand, the `level` argument is optional. However, I tried both sf.forecast(h=12) and sf.forecast(h=12, level=[90]) with the same result.

jmoralez commented 1 year ago

How are you instantiating the AutoCES model? Are you providing prediction_intervals?

RashidBakirov commented 1 year ago

Nope. Simply

season_length = 12
models = [AutoCES(season_length=season_length)]

jmoralez commented 1 year ago

Can you provide a reproducible example? This works:

from statsforecast import StatsForecast
from statsforecast.models import AutoCES
from statsforecast.utils import generate_series

series = generate_series(2, freq='M')
sf = StatsForecast(models=[AutoCES(season_length=12)], freq='M')
sf.forecast(df=series, h=24)

RashidBakirov commented 1 year ago

Yes, I tried the example from the tutorial, which works as well, so I think the issue may be in my data. I tried to provide an example dataset by replacing my actual values with random values, but that ran without any problems. I will post here if I manage to isolate the problem.

RashidBakirov commented 1 year ago

I think I have managed to isolate the error. Some of the items I was trying to forecast had the same value across the whole series; after removing those, I was able to run AutoCES on the rest. Removing these items also eliminated the following warning, which I have traced to AutoTheta:

/databricks/python/lib/python3.10/site-packages/statsmodels/tsa/stattools.py:681: RuntimeWarning: invalid value encountered in true_divide
  acf = avf[: nlags + 1] / avf[0]
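For anyone wanting to check their own data for the same condition, here is a minimal sketch of how constant series could be detected before fitting. It assumes the data is in long format as (unique_id, y) pairs; the helper name `find_constant_series` and the sample rows are hypothetical and are not part of statsforecast.

```python
from collections import defaultdict


def find_constant_series(rows):
    """Return the sorted ids whose series contain a single distinct value.

    rows: iterable of (unique_id, y) pairs in long format (hypothetical input).
    """
    values_by_id = defaultdict(set)
    for uid, y in rows:
        values_by_id[uid].add(y)
    return sorted(uid for uid, values in values_by_id.items() if len(values) == 1)


# Made-up sample: series "a" is constant, series "b" is not.
rows = [("a", 5.0), ("a", 5.0), ("a", 5.0), ("b", 1.0), ("b", 2.0), ("b", 3.0)]

constant_ids = find_constant_series(rows)
dropped = set(constant_ids)

# Keep only the rows that belong to non-constant series.
kept = [(uid, y) for uid, y in rows if uid not in dropped]
```

With a pandas DataFrame that has `unique_id` and `y` columns, the same pairs can be produced with `zip(df["unique_id"], df["y"])`.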

jmoralez commented 1 year ago

The error you were getting with the constant series was the one in this issue?

RashidBakirov commented 1 year ago

yes

jmoralez commented 1 year ago

Can you provide a reproducible example?

fehtemam commented 1 year ago

> How are you instantiating the AutoCES model? Are you providing prediction_intervals?

@jmoralez Is there a new syntax in 1.6.0 to do this? I never had an issue like this before. How do you pass prediction_intervals? Is it different from passing level=[95]?

Update: It seems the syntax has changed. I updated my model to

intervals = ConformalIntervals(h=52, n_windows=2)
models = [
    AutoCES(model='S', season_length=52, prediction_intervals=intervals)
]

@jmoralez Can you please tell me what n_windows is in the ConformalIntervals method? I couldn't find it in the docs.

jmoralez commented 1 year ago

@fehtemam the prediction_intervals argument allows models to compute prediction intervals using conformal prediction. However, you should be able to keep using the same code for models that can produce intervals out of the box, like CES. Can you provide the code you were running before that doesn't work now?
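For readers wondering what conformal intervals do with `n_windows`, here is a minimal sketch in plain Python. It assumes that `n_windows` is the number of backtest windows whose forecast errors are collected, and that the interval is built by widening the point forecast by an empirical quantile of the absolute errors at each horizon step. The function name `conformal_interval` and its inputs are hypothetical. This is a generic illustration, not statsforecast's implementation, so check the library documentation for the exact behavior.

```python
import math


def conformal_interval(point_forecast, window_errors, level):
    """Widen point forecasts by an empirical quantile of past absolute errors.

    point_forecast: list of h point forecasts.
    window_errors: one list of h signed errors (actual - forecast) per
        backtest window; len(window_errors) plays the role of n_windows here.
    level: desired coverage in percent, e.g. 95.
    """
    lo, hi = [], []
    for step, forecast in enumerate(point_forecast):
        # Absolute errors observed at this horizon step, one per window.
        scores = sorted(abs(errors[step]) for errors in window_errors)
        # Conservative empirical quantile: the smallest score such that at
        # least `level` percent of the scores are at or below it.
        rank = math.ceil(level / 100 * len(scores))
        rank = min(max(rank, 1), len(scores))
        width = scores[rank - 1]
        lo.append(forecast - width)
        hi.append(forecast + width)
    return lo, hi


# Made-up numbers: a horizon of 2 steps and 2 backtest windows.
lo, hi = conformal_interval(
    point_forecast=[10.0, 20.0],
    window_errors=[[1.0, -2.0], [-3.0, 4.0]],
    level=95,
)
```

In this sketch, two windows mean each quantile is taken from only two errors per step, so the intervals are coarse. More windows give a finer quantile but need a longer history to backtest on.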

fehtemam commented 1 year ago

@jmoralez sure. I had this code working for about six months:

models = [
    SeasonalNaive(season_length=52),
    SeasonalWindowAverage(season_length=52, window_size=5),
    AutoCES(model='S', season_length=52)
]

# forecast horizon
hrz = 52
# level for confidence intervals
lv = [95]

# forecast object
sf = StatsForecast(
    df=df_train,
    models=models,
    freq='W'
)
frcst_df = sf.forecast(h=hrz, level=lv, X_df=df_exg)

I updated to 1.6 the other day and it complained about having to pass prediction intervals. So I changed the code to:

# forecast horizon
hrz = 52
# level for confidence intervals
lv = [95]
intervals = ConformalIntervals(h=hrz, n_windows=2)

models = [
    SeasonalNaive(season_length=52, prediction_intervals=intervals),
    SeasonalWindowAverage(season_length=52, window_size=5, prediction_intervals=intervals),
    AutoCES(model='S', season_length=52, prediction_intervals=intervals)
]

# forecast object
sf = StatsForecast(
    df=df_train,
    models=models,
    freq='W'
)
frcst_df = sf.forecast(h=hrz, level=lv, X_df=df_exg)

And now the code works. ¯\_(ツ)_/¯

jmoralez commented 1 year ago

@fehtemam the error is coming from SeasonalWindowAverage, which can't compute intervals out of the box. In 1.5.0 the output didn't have intervals for that model, whereas 1.6.0 raises an error. To fix it, you can set the prediction_intervals argument for that model only, e.g.

intervals = ConformalIntervals(h=hrz, n_windows=2)
models = [
    SeasonalNaive(season_length=52),
    SeasonalWindowAverage(season_length=52, window_size=5, prediction_intervals=intervals),
    AutoCES(model='S', season_length=52)
]

or you can remove it from the list and compute the prediction intervals for the other two models.

github-actions[bot] commented 1 year ago

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.

rajshah4 commented 8 months ago

I noticed that the tutorial at https://nixtlaverse.nixtla.io/statsforecast/docs/tutorials/statisticalneuralmethods.html also had errors related to the prediction intervals. To get the tutorial to work, I needed to add the intervals to each of the models.

jmoralez commented 8 months ago

Thanks @rajshah4, would you like to open a PR with that fix?