Nested forecasting - common recipe feature engineering causing issue in model calibration

daniepi commented 7 months ago

Hi @mdancho84, First and foremost thanks for this amazing suite of modeltime packages. I am trying to model many individual time series using nested forecasting as mentiond here: https://business-science.github.io/modeltime/articles/nested-forecasting.html

I came across a peculiar problem with using a commonly defined recipe with date-based features and having timeseries of differing lengths and not fully overlapping periods.

With recipe like this

rec_date_feats <- recipe(y~ date, data = extract_nested_train_split(nested_data_tbl)) |>
  step_timeseries_signature(date) |>
  step_rm(date) |>
  step_normalize(date_index.num) |>
  step_zv(all_predictors()) |>
  step_corr(all_numeric_predictors(), threshold = 0.99) |>
  step_dummy(all_nominal_predictors(), one_hot = TRUE)

The training works well and models are fitted well on all time series. I see from the recipes nested in the output of modeltime_nested_fit that not all series where fitted with same features (I guess it is the ZV and CORR removal which decided to remove different features for different series) which is ok and wanted. Unfortunately, models for some series are lacking .calibration_data, so I was trying to figure out why. What I have found out is that it works well for all series which end up with same features as the original recipe definition, while it fails producing .calibration_data for all other series.

Simple example. I have 8 series. I build the recipe as stated above with extract_nested_train_split(nested_data_tbl) which by default uses .row_id = 1, i.e. first series. Let say series nr. 7 and 8 were trained with different feature sets (because their training period was slightly different to series 1-6). Then the calculation of .calibration_data would fail.

I can manualy produce new_data using bake and prep using the recipe specifically extracted for series 7/8 and the predict(model, new_data = ...) and predictions work fine. e.g.

mod <- modeltime_table(nested_modeltime_tbl$.modeltime_tables[[7]]$.model[[2]])
recp <- nested_modeltime_tbl$.modeltime_tables[[7]]$.model[[2]]$pre$actions$recipe$recipe

# This fails
mod |> modeltime_calibrate(new_data = extract_nested_test_split(nested_data_tbl, .row_id = 7))

# This works
bake_test <- bake(prep(recp, training = extract_nested_train_split(nested_data_tbl, .row_id = 7)),
                  new_data = extract_nested_test_split(nested_data_tbl, .row_id = 7))
predict(mod$.model[[1]]$fit$fit, new_data = bake_test |> select(!x))

Finally, when I create the initial recipe with extract_nested_train_split(nested_data_tbl, .row_id = 7), then calibration fails for first 6 series and works for series 7.

I don't know the implementation details well, but I think the problem is that when prediction data for calibration is being constructed, it bakes the recipe trained on the data supplied when recipe is being instantiated and not on the actual (individual time series) training data. Hence it tries to predict a model trained on a given feature set using new data with different feature set.

Is my understanding correct? Thanks for any feedback. :)

daniepi commented 6 months ago

Hi again,

I was digging into to the code. I think the problem arises from mdl_time_forecast https://github.com/business-science/modeltime/blob/master/R/modeltime-forecast.R#L1034

The problem is that mld$blueprint$recipe is a trained recipe as estimated on whatever is the first series in nested data https://github.com/business-science/modeltime/blob/master/R/modeltime-forecast.R#L927-L928

Hence if any of the series does not share the same time index, processing steps that remove some features (like CORR, ZV) will create discrepancy between data used to train model for such series vs. data used to predict on. This seems to create problem for models like XGBoost, where given set of features is expected in predict time, but it receives different set.

mdancho84 commented 6 months ago

Ok sorry haven't had time to dig into it. But yeah the logic there was that the recipe used on the first model can be used on others. Might need to rethink that

business-science / modeltime

Nested forecasting - common recipe feature engineering causing issue in model calibration #243