rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[BUG] ExponentialSmoothing() seems to fit the same model for all time series when ts_num > 1 #4080

Open morkapronczay opened 3 years ago

morkapronczay commented 3 years ago

Describe the bug: With cuml.ExponentialSmoothing, the results for a given time series depend on the number of series (ts_num) passed to the model. The forecast for the same series changes when more time series are included in the model.

Steps/Code to reproduce bug: Follow your own ARIMA demo, but instead of ARIMA, fit ExponentialSmoothing() with the same parameters, once for 2 variables and once for 4 variables. I attach a picture and the modified notebook as a PDF (attachments: image, arima_demo.pdf).

Expected behavior: The number of time series should have no effect on the fits or the forecasts. It seems to me that the same model is fitted to all the time series; a separate model should be fitted to each series.
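The expected behavior can be stated as an invariance check: the forecast for a series should be identical whether it is fitted alone or alongside other series. Below is a minimal sketch of that check in plain Python. The names `ses_forecast`, `forecast_batch`, and `max_abs_diff` are hypothetical helpers, and the fixed-alpha simple exponential smoothing is only a stand-in for a real model; it is not cuML's implementation.

```python
from typing import Callable, List, Sequence

Series = Sequence[float]
Forecaster = Callable[[Series, int], List[float]]


def ses_forecast(series: Series, horizon: int, alpha: float = 0.5) -> List[float]:
    """Simple exponential smoothing with a fixed alpha (illustrative stand-in)."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1.0 - alpha) * level
    # SES forecasts are flat: every future step equals the final level.
    return [level] * horizon


def forecast_batch(columns: Sequence[Series], horizon: int,
                   forecaster: Forecaster = ses_forecast) -> List[List[float]]:
    """Fit one independent model per column; return one forecast per column."""
    return [forecaster(column, horizon) for column in columns]


def max_abs_diff(a: Series, b: Series) -> float:
    """Largest absolute element-wise difference between two forecasts."""
    return max(abs(x - y) for x, y in zip(a, b))


series_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
series_b = [5.0 * v for v in series_a]

alone = forecast_batch([series_a], horizon=3)[0]
in_batch = forecast_batch([series_a, series_b], horizon=3)[0]

# With independent per-series models, adding series_b changes nothing.
print(max_abs_diff(alone, in_batch))  # 0.0
```

The same comparison could be applied to the forecasts of any batched model: a non-zero difference for the shared series would indicate that the fits are not independent.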

Additional context: This makes ExponentialSmoothing quite hard to use, as modeling time series one by one removes any advantage of using the GPU; it seems to be slower than CPU-based pure statsmodels. This is my first issue on GitHub, so any feedback on it is appreciated!

vidosits commented 3 years ago

Here's a minimal example to reproduce this bug:

import cudf
import numpy as np
from cuml import ExponentialSmoothing
from matplotlib import pyplot as plt

# Three 12-point seasons; each season is the previous one shifted up by 1.
data = cudf.Series([1, 2, 3, 4, 5, 6,
                    7, 8, 9, 10, 11, 12,
                    2, 3, 4, 5, 6, 7,
                    8, 9, 10, 11, 12, 13,
                    3, 4, 5, 6, 7, 8, 9,
                    10, 11, 12, 13, 14],
                    dtype=np.float64)

# Case 1: fit and forecast the series on its own.
cu_hw = ExponentialSmoothing(data, seasonal_periods=12, ts_num=1)
cu_hw.fit()
cu_pred = cu_hw.forecast(10)

plt.plot(range(0, 36), data.to_pandas().tolist())
plt.plot(range(36, 46), cu_pred.to_pandas().tolist())

# Case 2: fit the same series together with a scaled copy (data * 5).
two_series = cudf.concat([data, data*5], axis=1)
cu_hw = ExponentialSmoothing(two_series, seasonal_periods=12, ts_num=2)
cu_hw.fit()
cu_pred = cu_hw.forecast(10)

# Plot column 0 only; its forecast should match the one from case 1.
plt.plot(range(0, 36), two_series.to_pandas()[0].tolist())
plt.plot(range(36, 46), cu_pred.to_pandas()[0].tolist())

(attachments: image, minimal_example.ipynb.gz)
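Because the second series in the example above is exactly `data * 5`, there is a second sanity check available: for an additive Holt-Winters model fitted independently, the forecast of the scaled series should be 5 times the forecast of the original. The sketch below demonstrates this with a small pure-Python additive Holt-Winters recursion. The function `holt_winters_additive`, its simple initialisation, and the fixed smoothing parameters are all assumptions made for illustration; cuML optimises its parameters and may initialise differently, so the relation there would be expected to hold only approximately.

```python
from typing import List, Sequence


def holt_winters_additive(series: Sequence[float], season: int, horizon: int,
                          alpha: float = 0.3, beta: float = 0.1,
                          gamma: float = 0.2) -> List[float]:
    """Additive Holt-Winters with fixed parameters (illustrative only).

    Requires at least two full seasons of data for the initialisation.
    """
    first = series[:season]
    second = series[season:2 * season]
    # Initial level: mean of the first season.
    level = sum(first) / season
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / (season * season)
    # Initial seasonal terms: deviation of each point from the initial level.
    seasonal = [value - level for value in first]

    for t in range(season, len(series)):
        y = series[t]
        prev_level, prev_trend = level, trend
        s = seasonal[t % season]
        level = alpha * (y - s) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal[t % season] = gamma * (y - prev_level - prev_trend) + (1 - gamma) * s

    n = len(series)
    # Step h ahead: extrapolate the trend and reuse the matching seasonal term.
    return [level + h * trend + seasonal[(n + h - 1) % season]
            for h in range(1, horizon + 1)]


# Same values as the minimal example: three seasons of length 12.
data = [float(start + i) for start in (1, 2, 3) for i in range(12)]
scaled = [5.0 * v for v in data]

forecast_1x = holt_winters_additive(data, season=12, horizon=10)
forecast_5x = holt_winters_additive(scaled, season=12, horizon=10)

# Every step of the recursion is linear in the data, so forecasts scale too.
scales_correctly = all(abs(f5 - 5.0 * f1) < 1e-9
                       for f1, f5 in zip(forecast_1x, forecast_5x))
print(scales_correctly)  # True
```

Applying the same ratio test to columns 0 and 1 of the batched cuML forecast would give a quick indication of whether the second series received its own model.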

dantegd commented 3 years ago

cc @Nyrio

vidosits commented 3 years ago

Hey @dantegd, @Nyrio, could you take another look at this if you can? Thanks in advance!

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

vidosits commented 2 years ago

confirmed, still an issue
