unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/

Poor out-of-sample performance in global model #549

Closed · JulianNyarko closed this issue 2 years ago

JulianNyarko commented 2 years ago

Love darts! I have a problem that I can't wrap my head around, though. The out-of-sample performance of global models seems to be quite poor when compared to univariate time series modeling.

I have a simulated dataset with ~500 target series and ~400 covariate series. My training dataset spans periods 1-30, validation covers 31-40, and I want to predict 41-50. When I train an NBEATSModel on an individual target series and check historical_forecasts(), I get good results throughout. However, when I train a global model and use historical_forecasts() on an individual series, the predictions during the training period are very good, but the predictions during the validation period are very poor. This confuses me. I could understand if both in- and out-of-sample predictions for the global model were poor during backtesting, but the fact that only the out-of-sample predictions are poor makes me think that something might be wrong with the validation step.

I understand this might be difficult to answer in the abstract, but is that result expected at all?

Just for reference, this is how I train and backtest the univariate model:

from darts.models import NBEATSModel

# Univariate case: a single target series with a single past covariate series
model_NBEATS = NBEATSModel(input_chunk_length=input_size,
                           output_chunk_length=output_size,
                           n_epochs=epochs,
                           force_reset=True,
                           batch_size=1024,
                           torch_device_str="cuda:0")
model_NBEATS.fit(series=ts_tr,
                 past_covariates=cov_list_all[0],
                 val_series=ts_tr_val,
                 val_past_covariates=cov_list_all[0],  # single series, to match val_series
                 verbose=True)

hf_cov = model_NBEATS.historical_forecasts(ts_tr,
                                           past_covariates=cov_list_all[0],
                                           start=0.1,
                                           forecast_horizon=output_size,
                                           stride=1,
                                           retrain=False,
                                           verbose=True)

And for the global model:

# Global case: lists of target and covariate series
model_NBEATS_global = NBEATSModel(input_chunk_length=input_size,
                                  output_chunk_length=output_size,
                                  n_epochs=epochs,
                                  force_reset=True,
                                  batch_size=1024,
                                  torch_device_str="cuda:0")
model_NBEATS_global.fit(series=ts_list_all,
                        past_covariates=cov_list_all,
                        val_series=ts_list_val,
                        val_past_covariates=cov_list_all,
                        verbose=True)

# Backtest the global model on one individual series
hf_cov_global = model_NBEATS_global.historical_forecasts(ts_tr,
                                                         past_covariates=cov_list_all[0],
                                                         start=0.1,
                                                         forecast_horizon=output_size,
                                                         stride=1,
                                                         retrain=False,
                                                         verbose=True)
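
To quantify the gap, one option is to score the backtest separately on each side of the train/validation boundary. A rough sketch, assuming ts_tr spans periods 1-40 so that splitting at 0.75 separates the training window (1-30) from the validation window (31-40); split_before, slice_intersect and mape are standard darts utilities:

from darts.metrics import mape

# Split the target at the train/validation boundary (periods 1-30 vs. 31-40)
train_part, val_part = ts_tr.split_before(0.75)

# Compare the backtest against each part; slice_intersect keeps only
# the time span the two series have in common
insample_err = mape(train_part, hf_cov_global.slice_intersect(train_part))
outsample_err = mape(val_part, hf_cov_global.slice_intersect(val_part))
print(f"in-sample MAPE: {insample_err:.2f}  out-of-sample MAPE: {outsample_err:.2f}")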
hrzn commented 2 years ago

Hi @JulianNyarko, from your description it looks like the model might have overfit the training part of the data. I see that you closed the issue; did you manage to solve it somehow?
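
If it helps, a common way to fight this with the Torch-based models is early stopping on the validation loss. In recent Darts versions this can be wired in through PyTorch Lightning. A sketch, assuming a version that exposes pl_trainer_kwargs; the patience and delta values are illustrative:

from pytorch_lightning.callbacks import EarlyStopping
from darts.models import NBEATSModel

# Stop training once the validation loss has not improved for 5 epochs.
# "val_loss" is logged when a val_series is passed to fit().
stopper = EarlyStopping(monitor="val_loss", patience=5, min_delta=1e-4, mode="min")

model = NBEATSModel(input_chunk_length=input_size,
                    output_chunk_length=output_size,
                    n_epochs=epochs,
                    batch_size=1024,
                    pl_trainer_kwargs={"callbacks": [stopper]})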

JulianNyarko commented 2 years ago

Thanks, @hrzn. Yes, I eventually figured out it was a pretty standard overfitting issue. It is a bit hard to address in this case, partly because I can't track the training and validation loss over epochs to inspect the learning curve. Does that feature exist / is it planned? I could open a new feature request if that helps!

hrzn commented 2 years ago

Hi @JulianNyarko, the way to do it is with TensorBoard. Set log_tensorboard=True when building the model, and you can then visualise the training and validation losses in TensorBoard.
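
A minimal sketch, assuming a recent Darts version; the model_name is arbitrary and only determines where the logs are written:

from darts.models import NBEATSModel

# log_tensorboard=True makes Darts write training and validation losses
# as TensorBoard event files under the model's working directory
model = NBEATSModel(input_chunk_length=input_size,
                    output_chunk_length=output_size,
                    n_epochs=epochs,
                    log_tensorboard=True,
                    model_name="nbeats_global")  # hypothetical name, used in the log path
model.fit(series=ts_list_all, past_covariates=cov_list_all,
          val_series=ts_list_val, val_past_covariates=cov_list_all,
          verbose=True)

Then point TensorBoard at the log directory, e.g. tensorboard --logdir darts_logs (the exact path depends on work_dir and the Darts version).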