business-science / modeltime.gluonts

GluonTS Deep Learning with Modeltime
https://business-science.github.io/modeltime.gluonts/

Large difference in training time if lookback_length is set explicitly or not #53

Open vidarsumo opened 1 year ago

Training is much slower when lookback_length is left unset than when it is set explicitly, and the difference is more noticeable on larger data sets. In the first example below I set lookback_length to the same value as prediction_length, and the code runs fast. In the second I rely on the default, i.e. lookback_length = prediction_length, and the same code is much slower, even though the two calls should be equivalent.

Fit DeepState by setting the lookback length

model_fit_deepstate <- deep_state(
  id                    = "id",
  freq                  = "M",
  prediction_length     = 24,
  lookback_length       = 24,
  epochs                = 5
) %>%
  set_engine("gluonts_deepstate") %>%
  fit(value ~ ., training(m750_splits))

100%|██████████| 50/50 [00:02<00:00, 21.46it/s, epoch=1/5, avg_epoch_loss=8.94]
100%|██████████| 50/50 [00:02<00:00, 23.93it/s, epoch=2/5, avg_epoch_loss=8.26]
100%|██████████| 50/50 [00:02<00:00, 24.28it/s, epoch=3/5, avg_epoch_loss=8.03]
100%|██████████| 50/50 [00:02<00:00, 22.54it/s, epoch=4/5, avg_epoch_loss=7.77]
100%|██████████| 50/50 [00:02<00:00, 23.30it/s, epoch=5/5, avg_epoch_loss=7.41]

Fit DeepState by not setting the lookback length

model_fit_deepstate <- deep_state(
  id                    = "id",
  freq                  = "M",
  prediction_length     = 24,
  epochs                = 5
) %>%
  set_engine("gluonts_deepstate") %>%
  fit(value ~ ., training(m750_splits))

100%|██████████| 50/50 [00:04<00:00, 11.16it/s, epoch=1/5, avg_epoch_loss=8.18]
100%|██████████| 50/50 [00:03<00:00, 12.77it/s, epoch=2/5, avg_epoch_loss=7.15]
100%|██████████| 50/50 [00:04<00:00, 12.19it/s, epoch=3/5, avg_epoch_loss=7.06]
100%|██████████| 50/50 [00:03<00:00, 12.51it/s, epoch=4/5, avg_epoch_loss=6.91]
100%|██████████| 50/50 [00:03<00:00, 13.06it/s, epoch=5/5, avg_epoch_loss=6.66]

On larger data sets I've seen an 8x difference.
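
To put a number on the gap in the small example above, the it/s figures from the two progress-bar logs can be averaged and compared. This is only a rough sketch in base R: the values are copied from the logs in this issue, and the variable names (`its_explicit`, `its_default`, `speedup`) are illustrative and not part of modeltime.gluonts.

```r
# Iterations per second, copied from the two progress-bar logs in this issue.
its_explicit <- c(21.46, 23.93, 24.28, 22.54, 23.30)  # lookback_length = 24
its_default  <- c(11.16, 12.77, 12.19, 12.51, 13.06)  # lookback_length not set

# Average throughput across the five epochs of each run.
mean_explicit <- mean(its_explicit)
mean_default  <- mean(its_default)

# A ratio above 1 means the explicit setting trains faster.
speedup <- mean_explicit / mean_default

cat(sprintf("explicit: %.2f it/s\n", mean_explicit))
cat(sprintf("default:  %.2f it/s\n", mean_default))
cat(sprintf("speedup:  %.2fx\n", speedup))
```

For these two runs the ratio comes out at about 1.87x, so the explicit setting is a little under twice as fast on this data set.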