Closed Allena101 closed 6 months ago
Hi @Allena101,
This seems to be a purely modeling problem; the models are indeed seeing the entire dataset, but optimizing their loss does not guarantee "coherence" of the forecast with the training data (especially at the junction point). It can become even "worse" with deep learning models and covariates, as the model might be capturing patterns that make the first forecasted values slightly below the last known value. This mistake at the boundary will ultimately depend on the architecture of the model used.
Tree-based models such as LightGBM are not suited to your problem because they cannot predict values outside of the training range.
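For intuition on why tree-based models cannot extrapolate: a regression tree predicts the mean (or another aggregate) of the training targets that fall into a leaf, so no prediction can ever land outside the training targets' range. A minimal pure-Python sketch of that bound (the "leaf" here is a toy stand-in, not LightGBM's actual split logic, and the numbers are made up):

```python
# Toy illustration: a regression-tree "leaf" predicts the mean of the
# training targets it holds, so predictions stay inside the training range.
train_targets = [1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9, 10]

def leaf_prediction(targets_in_leaf):
    # Mean of the targets in a leaf: bounded by min/max of the training targets.
    return sum(targets_in_leaf) / len(targets_in_leaf)

# Even a leaf holding only the largest targets cannot exceed the training max.
pred = leaf_prediction(train_targets[-3:])
print(pred)  # 9.0 — never above max(train_targets) == 10
```

So on a steadily growing cumulative series, the forecast is pulled back inside the historical range, which is exactly the "starts too low" symptom.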
If your series is cumulative, it might help to make it stationary. It will slightly change your problem but will make the task easier for models.
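For intuition, first-order differencing replaces a trending cumulative series with its per-step increments, which stay in a narrow bounded range and are much easier for the model to fit (a minimal pure-Python sketch; the series is made up):

```python
# A made-up cumulative series: non-decreasing, with a strong upward trend.
cumulative = [0, 3, 3, 7, 12, 12, 15, 21, 21, 24, 30]

# First-order differencing yields per-step increments (one element shorter).
increments = [b - a for a, b in zip(cumulative, cumulative[1:])]
print(increments)  # [3, 0, 4, 5, 0, 3, 6, 0, 3, 6] — no trend, bounded range
```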
Thank you! This is a great response! I have made series stationary before, but I don't know how to convert the model's predictions back afterwards, similar to scaler.inverse_transform. Perhaps you can reverse it if you use the differencing method?
The Diff() transformer is reversible; you can indeed use inverse_transform:
from darts.dataprocessing.transformers import Diff
from darts.utils.timeseries_generation import linear_timeseries

# Generate a simple linear series and difference it
ts = linear_timeseries(length=100)
tr_diff = Diff()
ts_diff = tr_diff.fit_transform(ts)

# Invert the differencing; the result matches the original series exactly
ts_inv_diff = tr_diff.inverse_transform(ts_diff)
assert ts.time_index.equals(ts_inv_diff.time_index)
assert all(ts.values() == ts_inv_diff.values())
You can also call the method cumsum() on a TimeSeries, but the time index will be 1 step shorter.
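Underneath, the inversion is just a cumulative sum of the forecast increments, re-anchored at the last known value of the original series. A minimal pure-Python sketch with made-up numbers (Diff.inverse_transform does the equivalent bookkeeping for you):

```python
# Last known value of the original (cumulative) series.
anchor = 10
# Hypothetical model forecast produced on the differenced series.
forecast_increments = [1, 0, 2]

# Undo the differencing: cumulative-sum the increments on top of the anchor.
forecast = []
level = anchor
for inc in forecast_increments:
    level += inc
    forecast.append(level)

print(forecast)  # [11, 11, 13]
```

Because the forecast increments here are non-negative, the reconstructed forecast can never dip below the last known value, which avoids the "starts too low" problem at the junction.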
Ok, so I got this very frustrating issue where I have some time series that are not linear, but cumulative. It's another electricity usage dataset.
What is going wrong is that the predictions are BELOW/LOWER than the historic series, when there is no way that such a pattern would be learned. Since the series values are either 0 or positive, I can't fathom why the predictions would start below the last known value.
However, the actual predictions (what is returned by model.predict()) are increasing in a cumulative manner, but they start too low.
First I thought that there was some kind of issue with covariates, but removing them still caused the low predictions. First I used LightGBM, and then I tried N-BEATS and N-HiTS, which also showed the same erroneous predictions.
Finally I tried the DeepTCN model, which does not follow this pattern!
Still, when I look at target_series.tail() I can see that the last values were higher than the predictions, and the series follows a pattern that never decreases (though there are patches of identical values, i.e. no increase).
Changing the lags, output_chunk_length and metric parameters does not change this pattern.
LGBM_Model = LightGBMModel(
    lags=3,
    output_chunk_length=3,
    metric="rmse",
)
LGBM_Model.fit(target)
preds = LGBM_Model.predict(3)
preds.values()
It's as if the model is not seeing the last values in the target series, even though I have checked that they are there with target.tail(), plus TCN seemingly can avoid this pattern.
How would I go about troubleshooting this issue? I just can't understand how those other models would find a decreasing pattern when such a pattern is nowhere in the pretty long series.
To clarify the pattern a bit further: let's say the target series is [1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9, 10]; then the model with model.predict(n=3) would predict something like [9, 10, 11], as if it did not look at the last part of the target series.