Over the last few weeks I have tried to use the FFORMA system to forecast a large set of data (about 110000 time series, some of them consisting entirely of zeros). Along the way I ran into a number of problems. Most of them could be solved, but now I'm stuck.
Here is a short list of the fixed problems:
- The parallelisation of THA_features and calc_forecasts did not work well with Linux on IBM Power. Fixed it by using foreach and doParallel.
- THA_features produces NaN if the time series consists entirely of zeros. Fixed it by returning zeros instead of NaN from stl_features in the tsfeatures package if no seasonal component could be computed.
- train_interval_weights did not work when every time series has h=24. Fixed it by adding a check for NULL values.
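A minimal sketch of the foreach/doParallel pattern behind the first fix. This is not the actual patch: the function name parallel_features and the toy features (mean and standard deviation) are placeholders for illustration, and the real code would call the per-series work of THA_features or calc_forecasts instead.

```r
library(foreach)
library(doParallel)

# Hypothetical helper, not part of the FFORMA code base.
# Computes toy features for each series on a local cluster.
parallel_features <- function(series_list, cores = 2) {
  cl <- parallel::makeCluster(cores)
  # Make sure the workers are shut down even if an error occurs.
  on.exit(parallel::stopCluster(cl), add = TRUE)
  registerDoParallel(cl)

  # One task per series; the results are stacked row-wise into a matrix.
  foreach(s = series_list, .combine = rbind) %dopar% {
    c(mean = mean(s), sd = sd(s))
  }
}

features <- parallel_features(list(c(1, 2, 3), c(0, 0, 0)), cores = 2)
print(features)
```

An explicit cluster created with makeCluster is used here rather than fork-based parallelism, on the assumption that forking was the part that misbehaved on the platform in question.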
Hopefully I can upload the changed code soon.
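For the all-zero case, a minimal sketch of the idea in base R. It is not the actual change to stl_features: toy_features and zero_fill_features are hypothetical names, and the coefficient of variation only stands in for a feature that becomes undefined on a constant series.

```r
# Hypothetical toy feature set; the coefficient of variation is 0/0
# for an all-zero series and therefore NaN.
toy_features <- function(x) {
  c(mean = mean(x), cv = sd(x) / mean(x))
}

# Hypothetical helper: replace every non-finite feature value
# (NaN, NA, Inf, -Inf) with zero.
zero_fill_features <- function(features) {
  features[!is.finite(features)] <- 0
  features
}

x <- rep(0, 24)                  # a series that is completely zero
raw <- toy_features(x)           # contains NaN in "cv"
clean <- zero_fill_features(raw) # NaN replaced by 0
print(raw)
print(clean)
```

Replacing the values after the fact is a blunt approach; the fix described above does it inside the feature function so that only the undefined seasonal features are set to zero.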
The main reason for this issue is the hyperparameter_search function. I am trying to train the FFORMA system on the 100000 original M4 time series combined with my own 110000 time series.
But apparently the hyperparameter search only works with fewer than 145000 time series.
If I try to use more time series, I get the following error:
cannot open compressed file '<PATH>/M4_Hyper.RData', probable reason 'No such file or directory'
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
elapsed = 488.97 Round = 1 max_depth = 10.0000 eta = 0.4000 subsample = 0.9000 colsample_bytree = 0.6000 nrounds = 200.0000 Value = -0.9861
elapsed = 605.43 Round = 2 max_depth = 12.0000 eta = 0.7395 subsample = 0.8938 colsample_bytree = 0.5587 nrounds = 228.0000 Value = -0.9827
elapsed = 110.10 Round = 3 max_depth = 7.0000 eta = 0.2818 subsample = 0.5933 colsample_bytree = 0.9659 nrounds = 59.0000 Value = -0.9876
elapsed = 248.60 Round = 4 max_depth = 8.0000 eta = 0.0042 subsample = 0.8924 colsample_bytree = 0.9966 nrounds = 126.0000 Value = -1.0407
elapsed = 62.96 Round = 5 max_depth = 13.0000 eta = 0.7589 subsample = 0.7186 colsample_bytree = 0.8478 nrounds = 68.0000 Value = NaN
elapsed = 84.47 Round = 6 max_depth = 7.0000 eta = 0.8266 subsample = 0.6338 colsample_bytree = 0.8825 nrounds = 110.0000 Value = NaN
Error in GP_deviance(beta = row, X = X, Y = Y, nug_thres = nug_thres, :
Infinite values of the Deviance Function,
unable to find optimum parameters
Calls: source ... eval -> eval -> <Anonymous> -> apply -> FUN -> GP_deviance
Do you have any idea or direction as to how I can fix this issue?