First, congratulations on modeltime. What a fantastic ecosystem you are building.

So, consider this example:


model_fit <- modeltime.gluonts::deep_ar(
    id                    = "id",
    freq                  = "M",
    prediction_length     = 24,
    lookback_length       = 36,
    epochs                = 10, 
    num_batches_per_epoch = 50,
    learn_rate            = 0.001,
    num_layers            = 2,
    dropout               = 0.10
  ) %>%
  set_engine("gluonts_deepar") %>%
  fit(formula = value ~ ., data = training(m750_splits))

calib <- modeltime_table(model_fit) %>%
  modeltime_calibrate(new_data = testing(m750_splits))

Running calib %>% modeltime_forecast(new_data = NULL, actual_data = m750) then throws this error:

Using '.calibration_data' to forecast.
Error: Problem occurred during prediction. Most likely cause is missing external regressors. Try using 'new_data' and supply a dataset containing all required columns. Error in model.frame.default(mod_terms, new_data, na.action = na.action, : object is not a matrix

Error: Can't subset columns that don't exist.
x Column `.value` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Unknown or uninitialised column: `.key`.

Thank you, Matt.

mdancho84 commented 3 years ago

Error Cause

The error occurs because the prediction step cannot find a regressor the model was trained with. In this case, it needs a column called "id" in the new_data data frame, since that column was provided during training.

> calib %>%
+     modeltime_forecast()
Using '.calibration_data' to forecast.
Error: Problem occurred during prediction. Most likely cause is missing external regressors. Try using 'new_data' and supply a dataset containing all required columns. Error in model.frame.default(mod_terms, new_data, na.action = na.action, : object is not a matrix

Error: Can't subset columns that don't exist.
x Column `.value` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

Columns used during training:

The formula value ~ . expands to value ~ date + id, so both date and id are required again at forecast time.

> training(m750_splits)
# A tibble: 282 x 3
   id    date       value
   <fct> <date>     <dbl>
 1 M750  1990-01-01  6370
 2 M750  1990-02-01  6430
 3 M750  1990-03-01  6520
 4 M750  1990-04-01  6580
 5 M750  1990-05-01  6620
 6 M750  1990-06-01  6690
 7 M750  1990-07-01  6000
 8 M750  1990-08-01  5450
 9 M750  1990-09-01  6480
10 M750  1990-10-01  6820
# … with 272 more rows
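
As a rough illustration of how the dot in the formula picks up every non-outcome column, the sketch below uses base R's terms() on a small stand-in data frame. The stand-in only copies the column names and the first three rows printed above; it is not the real m750 data, and modeltime's internal handling of the formula may differ from this base R view.

```r
# Stand-in for training(m750_splits): same column names and types,
# values copied from the first rows of the printout above.
train <- data.frame(
  id    = factor(rep("M750", 3)),
  date  = as.Date(c("1990-01-01", "1990-02-01", "1990-03-01")),
  value = c(6370, 6430, 6520)
)

# terms() resolves the dot against the columns of `data`,
# leaving out the outcome on the left-hand side.
expanded <- terms(value ~ ., data = train)

# Predictor names the model would expect to see again when forecasting.
predictors <- attr(expanded, "term.labels")
print(predictors)
```

Any column listed in predictors has to be present in the data supplied at forecast time, which is why a missing id column breaks the prediction.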


To fix this, provide the testing data again through new_data in modeltime_forecast():

> calib %>%
+     modeltime_forecast(new_data = testing(m750_splits))
# A tibble: 24 x 7
   .model_id .model_desc .key       .index     .value .conf_lo .conf_hi
       <int> <chr>       <fct>      <date>      <dbl>    <dbl>    <dbl>
 1         1 DEEPAR      prediction 2013-07-01  9283.    8030.   10536.
 2         1 DEEPAR      prediction 2013-08-01  9234.    7981.   10487.
 3         1 DEEPAR      prediction 2013-09-01  9746.    8493.   10999.
 4         1 DEEPAR      prediction 2013-10-01 10324.    9071.   11577.
 5         1 DEEPAR      prediction 2013-11-01 10637.    9384.   11890.
 6         1 DEEPAR      prediction 2013-12-01 10446.    9193.   11699.
 7         1 DEEPAR      prediction 2014-01-01 10376.    9123.   11630.
 8         1 DEEPAR      prediction 2014-02-01 10531.    9278.   11784.
 9         1 DEEPAR      prediction 2014-03-01 10622.    9369.   11875.
10         1 DEEPAR      prediction 2014-04-01 10575.    9322.   11828.
# … with 14 more rows
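
To catch this kind of problem before forecasting, a small pre-flight check can compare the training columns against the data intended for new_data. The sketch below is plain base R: missing_predictors() is a hypothetical helper written for this example and is not part of modeltime, and the data frames are tiny stand-ins that only mirror the column names used in this thread.

```r
# Hypothetical helper (not part of modeltime): returns the names of training
# predictors that are absent from the data you plan to pass as new_data.
missing_predictors <- function(train, new_data, outcome = "value") {
  required <- setdiff(names(train), outcome)
  setdiff(required, names(new_data))
}

# Stand-in for the training data (column names as in the thread).
train <- data.frame(
  id    = factor("M750"),
  date  = as.Date("1990-01-01"),
  value = 6370
)

# A future frame that keeps the id column: nothing is missing.
ok_future <- data.frame(id = factor("M750"), date = as.Date("2013-07-01"))

# A future frame without the id column: "id" is reported as missing.
bad_future <- data.frame(date = as.Date("2013-07-01"))

missing_ok  <- missing_predictors(train, ok_future)
missing_bad <- missing_predictors(train, bad_future)
print(missing_bad)
```

If the helper returns a non-empty vector, those columns need to be added to the data before it is passed to modeltime_forecast().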