intive-DataScience / tbats

BATS and TBATS forecasting methods
MIT License

TBATS fails for a simple dataset #40

Closed: ngupta23 closed this issue 1 year ago

ngupta23 commented 1 year ago

I tried running TBATS on the Quarterly group of the M3 dataset and received many errors. One of them is listed below with a reproducible example. Is there a way to fix it?

import pandas as pd
from tbats import TBATS

# Create Data (same as M3 Quarterly Q167)
subset = pd.DataFrame(
    {
        "ds": pd.date_range("1984-03-31", periods=35, freq="Q"),
        "y": [
            4140, 4510, 4378, 5010, 5222, 6260, 5094, 5854, 6010, 6428, 5890, 6414,
            6346, 6034, 6770, 7450, 7266, 6788, 6162, 8476, 6480, 5872, 5810, 6522,
            6444, 6110, 5778, 6068, 6018, 6174, 5202, 4820, 5114, 5686, 4946,
        ]
    }
)

# Create estimator
estimator = TBATS(seasonal_periods=[2], n_jobs=1)

# Fit model
fitted_model = estimator.fit(subset["y"])

Fitting fails with the following traceback:

  File "C:\Users\Nikhil\OneDrive\my_libraries\my_python_libraries\pycaret\pycaret_dev\time_series_debug.py", line 178, in <module>
    # Fit model
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\tbats\abstract\Estimator.py", line 98, in fit
    best_model = self._do_fit(y)
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\tbats\tbats\TBATS.py", line 80, in _do_fit
    seasonal_model = self._choose_model_from_possible_component_settings(y, components_grid=components_grid)
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\tbats\abstract\Estimator.py", line 144, in _choose_model_from_possible_component_settings
    models = pool.map(self._case_fit, components_grid)
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\multiprocessing\pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\multiprocessing\pool.py", line 774, in get
    raise self._value
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\tbats\abstract\Estimator.py", line 131, in _case_fit
    return case.fit(self._y)
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\tbats\abstract\Case.py", line 51, in fit
    arma_model = auto_arima(best_model.resid, stationary=True, trend='n',
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\pmdarima\arima\auto.py", line 422, in auto_arima
    y = check_endog(y, dtype=DTYPE, preserve_series=True)
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\pmdarima\utils\array.py", line 179, in check_endog
    endog = skval.check_array(
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\sklearn\utils\validation.py", line 929, in check_array
    n_samples = _num_samples(array)
  File "C:\Users\Nikhil\.conda\envs\pycaret_dev_sktime_17p1\lib\site-packages\sklearn\utils\validation.py", line 335, in _num_samples
    raise TypeError(
TypeError: Singleton array array(nan) cannot be considered a valid collection.
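
The last frames suggest that `best_model.resid` reached `auto_arima` as a scalar NaN rather than a series of residuals: `array(nan)` is a zero-dimensional array, which scikit-learn's validation rejects. A minimal sketch of that distinction, using only NumPy. `describe_residuals` is a hypothetical helper written for illustration and is not part of tbats, pmdarima, or scikit-learn:

```python
import numpy as np


def describe_residuals(resid):
    """Hypothetical helper: classify how a residual object looks as an array."""
    arr = np.asarray(resid, dtype=float)
    if arr.ndim == 0:
        # A bare scalar such as float("nan") becomes a 0-d ("singleton") array.
        # It has no length, so it cannot be treated as a collection of samples.
        return "singleton"
    if np.isnan(arr).any():
        # A proper 1-d series that still contains missing values.
        return "contains-nan"
    return "ok"


print(describe_residuals(float("nan")))       # scalar NaN, as in the traceback
print(describe_residuals([0.1, -0.2, 0.05]))  # an ordinary residual series
```

A check like this before the ARMA step would only surface the problem earlier; it does not explain why the residuals collapsed to NaN in the first place.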

Related Information:

cotterpl commented 1 year ago

Hi, thank you for reporting the issue. It seems to be the same issue as https://github.com/intive-DataScience/tbats/issues/12, which I have resumed investigating. In the meantime, please disable the Box-Cox transformation, as you mention.
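
A minimal sketch of that workaround, assuming it means passing `use_box_cox=False` to the `TBATS` constructor. The data and `seasonal_periods` are copied from the report above; `fit_without_box_cox` is a hypothetical wrapper name, and whether this setting avoids the error on this series has not been verified here:

```python
# Series from the report above (M3 Quarterly Q167).
y = [
    4140, 4510, 4378, 5010, 5222, 6260, 5094, 5854, 6010, 6428, 5890, 6414,
    6346, 6034, 6770, 7450, 7266, 6788, 6162, 8476, 6480, 5872, 5810, 6522,
    6444, 6110, 5778, 6068, 6018, 6174, 5202, 4820, 5114, 5686, 4946,
]


def fit_without_box_cox(series, seasonal_periods):
    """Hypothetical wrapper: fit TBATS with the Box-Cox transformation off."""
    # Imported lazily so the data can be inspected without tbats installed.
    from tbats import TBATS

    estimator = TBATS(
        seasonal_periods=seasonal_periods,
        use_box_cox=False,  # assumption: this is what "disable box" refers to
        n_jobs=1,
    )
    return estimator.fit(series)
```

Calling `fit_without_box_cox(y, seasonal_periods=[2])` mirrors the original reproduction with only the Box-Cox setting changed, so any remaining failure would point to a different cause.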

cotterpl commented 1 year ago

Thank you for the exact data to reproduce this issue. It allowed me to find the root cause and fix it. I have just released a new version with the fix.

ngupta23 commented 1 year ago

Thanks @cotterpl - I know how important reproducible examples are, hence I provided one :)

Thanks for the quick fix!