sktime / sktime

A unified framework for machine learning with time series
https://www.sktime.net
BSD 3-Clause "New" or "Revised" License
7.73k stars, 1.32k forks

[BUG] ThetaForecaster-Deseasonalizer returns numpy.array #396

Closed arainboldt closed 3 years ago

arainboldt commented 3 years ago

Describe the bug

      2 model.fit(series)
      3 model.predict(list(np.arange(3) + series.index[-1]))
      4
      5

~/virtualenvs/mtl/lib/python3.7/site-packages/sktime/forecasting/compose/_reduce.py in fit(self, y_train, fh, X_train)
    305         # fit base regressor
    306         regressor = clone(self.regressor)
--> 307         regressor.fit(X_train_tab, y_train_tab.ravel())
    308         self.regressor_ = regressor
    309

~/virtualenvs/mtl/lib/python3.7/site-packages/sktime/forecasting/theta.py in fit(self, y_train, fh, X_train)
    124             self.deseasonaliser_ = Deseasonalizer(sp=self.sp,
    125                                                   model="multiplicative")
--> 126             y_train = self.deseasonaliser_.fit_transform(y_train)
    127
    128         # fit exponential smoothing forecaster

~/virtualenvs/mtl/lib/python3.7/site-packages/sktime/transformers/single_series/base.py in fit_transform(self, y_train, **fit_params)
     43             Transformed time series.
     44         """
---> 45         return self.fit(y_train, **fit_params).transform(y_train)
     46
     47     def transform(self, y, **transform_params):

~/virtualenvs/mtl/lib/python3.7/site-packages/sktime/transformers/single_series/detrend/_deseasonalise.py in fit(self, y, **fit_params)
     65         """
     66
---> 67         y = check_y(y)
     68         self._set_oh_index(y)
     69         sp = check_sp(self.sp)

~/virtualenvs/mtl/lib/python3.7/site-packages/sktime/utils/validation/forecasting.py in check_y(y, allow_empty, allow_constant)
     65     if not isinstance(y, pd.Series):
     66         raise TypeError(
---> 67             f"`y` must be a pandas Series, but found type: {type(y)}")
     68
     69     # check that series is not empty

TypeError: `y` must be a pandas Series, but found type:

To Reproduce

from sktime.forecasting.compose import ReducedRegressionForecaster
from sktime.forecasting.theta import ThetaForecaster
import pandas as pd
import numpy as np
y = pd.Series("34 34 39 46 47 50 53 58 61 58 57 59 64 67 72 74".split(' ',-1)).astype(float)
model = ReducedRegressionForecaster(regressor=ThetaForecaster(smoothing_level=.2,sp=1), window_length=3, strategy="recursive")
model.fit(y)
model.predict(pd.Series(list(np.arange(3) + y.index[-1])))

Expected behavior

Additional context

Versions

      1 from sktime import show_versions
      2 show_versions()

ImportError: cannot import name 'show_versions' from 'sktime' (/Users/arainboldt/virtualenvs/mtl/lib/python3.7/site-packages/sktime/__init__.py)

mloning commented 3 years ago

Thanks @arainboldt for posting the bug report!

I'm not sure what you want to achieve. If you want to use the ThetaForecaster for forecasting, you could simply run:

from sktime.forecasting.theta import ThetaForecaster
import pandas as pd
import numpy as np
y = pd.Series("34 34 39 46 47 50 53 58 61 58 57 59 64 67 72 74".split(' ',-1)).astype(float)
model = ThetaForecaster(smoothing_level=.2,sp=1)
model.fit(y)
model.predict(np.arange(3) + y.index[-1])

Your code suggests that you want to reduce the forecasting problem to a tabular regression problem. In that case, you could replace the ThetaForecaster with any scikit-learn regressor. For example:

from sktime.forecasting.compose import ReducedRegressionForecaster
from sklearn.ensemble import RandomForestRegressor
import pandas as pd
import numpy as np
y = pd.Series("34 34 39 46 47 50 53 58 61 58 57 59 64 67 72 74".split(' ',-1)).astype(float)
model = ReducedRegressionForecaster(regressor=RandomForestRegressor(), window_length=3)
model.fit(y)
model.predict(np.arange(3) + y.index[-1])
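
To make "reducing forecasting to tabular regression" concrete, here is a minimal sketch of the idea using only NumPy. It is not sktime's implementation: the helper names `sliding_window` and `recursive_forecast` are hypothetical, and ordinary least squares stands in for whatever tabular regressor you would pass to the reducer. Each row of the table holds `window_length` consecutive observations and the target is the next value; the recursive strategy then feeds each prediction back into the window.

```python
import numpy as np


def sliding_window(y, window_length):
    """Hypothetical helper: turn a 1-D series into lagged rows X and next-step targets."""
    y = np.asarray(y, dtype=float)
    n_rows = len(y) - window_length
    # Row i holds y[i], ..., y[i + window_length - 1]; its target is y[i + window_length].
    X = np.stack([y[i:i + window_length] for i in range(n_rows)])
    target = y[window_length:]
    return X, target


def recursive_forecast(y, window_length, steps):
    """Hypothetical helper: fit a linear model on the table, then forecast recursively."""
    X, target = sliding_window(y, window_length)
    # Least squares with an intercept column stands in for any tabular regressor.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)

    window = list(np.asarray(y, dtype=float)[-window_length:])
    preds = []
    for _ in range(steps):
        y_hat = coef[0] + np.dot(coef[1:], window)
        preds.append(y_hat)
        # Slide the window forward, using the prediction as the newest observation.
        window = window[1:] + [y_hat]
    return np.array(preds)


y = [34, 34, 39, 46, 47, 50, 53, 58, 61, 58, 57, 59, 64, 67, 72, 74]
X, target = sliding_window(y, window_length=3)
preds = recursive_forecast(y, window_length=3, steps=3)
```

With 16 observations and a window of 3, the table has 13 rows. The regressor only ever sees plain arrays, which is why a forecaster that expects a `pd.Series` fails when used in its place.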

In either case, the error is not very helpful! We should check the passed regressor to make sure it's a tabular regressor.
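
One possible shape for such a check is sketched below. It is an illustration, not code from sktime: `check_tabular_regressor` and `NotARegressor` are hypothetical names, and testing against `sklearn.base.RegressorMixin` is an assumption about what should count as a tabular regressor. This test is strict (for example, it would reject a regressor wrapped in a scikit-learn `Pipeline`), so `sklearn.base.is_regressor` may be a more permissive alternative.

```python
from sklearn.base import RegressorMixin
from sklearn.linear_model import LinearRegression


def check_tabular_regressor(regressor):
    """Hypothetical helper: fail early if `regressor` is not a scikit-learn regressor."""
    if not isinstance(regressor, RegressorMixin):
        raise TypeError(
            "`regressor` must be a scikit-learn regressor, "
            f"but found type: {type(regressor)}"
        )
    return regressor


class NotARegressor:
    """Stand-in for a forecaster that was passed where a regressor is expected."""

    def fit(self, y):
        return self


# A scikit-learn regressor passes the check unchanged.
checked = check_tabular_regressor(LinearRegression())

# Anything else is rejected with a message that names the offending type.
try:
    check_tabular_regressor(NotARegressor())
    message = None
except TypeError as err:
    message = str(err)
```

Running the check at construction or at the start of `fit` would surface the problem before the data reaches the wrapped estimator.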

arainboldt commented 3 years ago

@mloning thanks for the quick follow-up and the recommendation. While reading through the docs, I came across the terms "tabular regression", "direct reduction", and "recursive reduction". I'm not familiar with these terms. Where can I read more about them?

mloning commented 3 years ago

Take a look at this paper: Bontempi, Gianluca & Ben Taieb, Souhaib & Le Borgne, Yann-Aël. (2013). "Machine Learning Strategies for Time Series Forecasting."

We're also in the process of writing a user guide to explain these terms (#377). Let me know if you're interested in contributing.

arainboldt commented 3 years ago

@mloning awesome! thanks for the references!