CadePGCM opened this issue 2 years ago
Thanks for your message. The predictor forecasts each sub-series produced by CEEMDAN, so it decomposes the entire series first; otherwise we could not get a test set to validate the result. After decomposition, it splits the data into a training set and a test set, and then starts forecasting.
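To make the order of operations concrete, here is a minimal sketch of the pipeline described above: decompose the whole series, then split each sub-series. The `decompose` function is a hypothetical stand-in for CEEMDAN (it just produces components that sum exactly back to the input, as real IMFs do); a real run would use something like PyEMD's `CEEMDAN`.

```python
import numpy as np

def decompose(series, n_imfs=3):
    """Hypothetical stand-in for CEEMDAN: returns components that
    sum exactly back to the original series (as real IMFs do)."""
    imfs = []
    residue = series.astype(float)
    for _ in range(n_imfs - 1):
        smooth = np.convolve(residue, np.ones(5) / 5.0, mode="same")
        imfs.append(residue - smooth)   # fast component
        residue = smooth
    imfs.append(residue)                # trend / final residue
    return np.array(imfs)

# Pipeline as described: decompose the WHOLE series first, then split.
rng = np.random.RandomState(0)
series = np.sin(np.linspace(0, 20, 200)) + 0.1 * rng.randn(200)
imfs = decompose(series)                # test-period values shape every IMF
split = 150
train_imfs, test_imfs = imfs[:, :split], imfs[:, split:]
# Each per-IMF forecaster is now trained on values computed
# with the test period already in view.
```

The split happens only after the decomposition, which is exactly why the test-period data can leak into the training sub-series.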
In other words, you are using future information to forecast, so there is no point to this methodology.
See this paper: https://www.inf.ufpr.br/lesoliveira/download/ESWA2017.pdf
Since the sum of the IMFs equals the original data, I think splitting first versus decomposing first will not affect the forecasting result much; normalization may have more of an effect. However, because the program is a one-step-ahead forecasting model, decomposing first does introduce some errors into the training set, so avoiding them would require running the predictor repeatedly to forecast one day at a time. That is very costly.
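The cost being described can be sketched as follows: a leakage-free one-step-ahead forecast must re-decompose the series before every single test day, so 50 test days means 50 full decompositions. As before, `decompose` is a hypothetical numpy stand-in for CEEMDAN, and a per-IMF persistence forecast stands in for the real predictor.

```python
import numpy as np

def decompose(series, n_imfs=3):
    # Hypothetical stand-in for CEEMDAN (components sum back to the input).
    imfs, residue = [], series.astype(float)
    for _ in range(n_imfs - 1):
        smooth = np.convolve(residue, np.ones(5) / 5.0, mode="same")
        imfs.append(residue - smooth)
        residue = smooth
    imfs.append(residue)
    return np.array(imfs)

rng = np.random.RandomState(0)
series = np.sin(np.linspace(0, 20, 200)) + 0.1 * rng.randn(200)

# Leakage-free variant: before forecasting day t, re-decompose only the
# data observed up to day t. One full decomposition per test day.
preds = []
for t in range(150, 200):
    imfs = decompose(series[:t])                 # past data only
    # persistence forecast per IMF -- a placeholder for the real model
    preds.append(sum(imf[-1] for imf in imfs))
preds = np.array(preds)
```

Because the stand-in IMFs sum exactly to the input, the persistence forecast here collapses to yesterday's value; with a real model, the point is only the loop structure: one decomposition per forecast day.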
Since the sum of the IMFs equals the original data, I think splitting first versus decomposing first will not affect the forecasting result much; normalization may have more of an effect.
But the paper I linked suggests it makes a massive difference. I understand the computational cost, but it's nonsensical to implement a forecaster with access to future data...
Thanks, I have read the paper. Admittedly, this predictor has the look-ahead bias that most such predictors have. There is some future information in the training set, but since we do not use the test set for training, the results still have meaning. I am not sure why there would be a massive difference, but if you use the rolling method to forecast the next 100th day with a fixed training set, it does.
if you use the rolling method to forecast the next 100th day with a fixed training set, it does.
Thanks. What do you mean by 'it does'? Does it perform well, or worse than expected?
I have tried two rolling methods. The first updates both the training set and the test set on each run, and it gives a performance similar to this program's (a good one: true one-step-ahead forecasting; I am not sure how it would perform inside this program, since I've only tried it on a few datasets). The second uses only the original training set, with no data from the original test set after the first split: when each forecasting run finishes, its result is appended to the training set for the next day's forecast (a bad one: effectively 100-step-ahead forecasting). The second method can predict far into the future, but it performs very poorly due to accumulated bias.
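The second rolling method above can be sketched in a few lines: after the first split, each forecast is fed back in as if it were observed data, so by the last step every input is synthetic. A persistence forecaster is used here as a hypothetical stand-in for the real per-IMF predictor.

```python
import numpy as np

# Method 2: recursive rolling forecast. Predictions are appended to the
# history and reused as inputs; the true test values are never seen.
rng = np.random.RandomState(0)
series = np.cumsum(rng.randn(200))      # random-walk test signal
history = list(series[:100])            # original training set only
preds = []
for _ in range(100):
    yhat = history[-1]                  # one-step forecast from history
    preds.append(yhat)
    history.append(yhat)                # prediction appended, never the truth
# With persistence every forecast collapses to series[99]; with a real
# model, errors compound step by step in exactly the same way, which is
# the "accumulated bias" described above.
```

This is why the second method degrades so badly: it is effectively a single 100-step-ahead forecast, not 100 one-step-ahead forecasts.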
Thanks for the insight. It may be helpful to include a rolling one-step implementation (although it would be compute-intensive), or at least a warning about the lookahead.
I could be mistaken, but I think you might be incorporating some lookahead, since the CEEMDAN decomposition sees the entire series.