business-science / modeltime.gluonts

GluonTS Deep Learning with Modeltime
https://business-science.github.io/modeltime.gluonts/

Using gluonts models with nested data #40

Open rafabelokurows opened 2 years ago

rafabelokurows commented 2 years ago

First of all, let me say I'm a fan of the modeltime ecosystem, and thank you Matt for contributing so much to the Data Science and R communities. Now for my issue, which is actually more of a question: How can I use gluonts models such as DeepAR and N-Beats with nested time series?

I successfully implemented the workflow described in your post on Iterative Nested Forecasting with modeltime, training several models for all nested time series at once with the modeltime_nested_fit() function. Unfortunately, I have not been able to include the models available in modeltime.gluonts in the same process and train them side by side with ML models. My guess is that modeltime_nested_fit() does not work with GluonTS models yet, but I thought I would ask here for some enlightenment, if it's not too much of a bother. Thank you so much for your time and attention!

The code I used for this experiment:

pacman::p_load(modeltime.gluonts,tidymodels,tidyverse,timetk)
#creating the nested data exactly like the post mentioned above
nested_data_tbl <- walmart_sales_weekly %>% 
  select(id,Date,Weekly_Sales) %>% 
  extend_timeseries(.id_var        = id,
                    .date_var      = Date,
                    .length_future = 52) %>%
  nest_timeseries(.id_var        = id,
                  .length_future = 52,
                  .length_actual = 52*2) %>%
  split_nested_timeseries(.length_test = 52)

#simple recipe
rec_prophet <- recipe(Weekly_Sales ~ Date, extract_nested_train_split(nested_data_tbl)) 

#prophet Workflow that works flawlessly
wflw_prophet <- workflow() %>%
  add_model(
    prophet_boost("regression", seasonality_yearly = TRUE) %>% 
      set_engine("prophet_xgboost")
  ) %>%
  add_recipe(rec_prophet)

#minimal DeepAR workflow that does not work
wflw_deepAR <- workflow() %>% 
  add_model(deep_ar(
    id                    = "id",
    freq                  = "W",
    prediction_length     = 48,
    lookback_length       = 36
  ) %>%
    set_engine("gluonts_deepar")) %>% 
  add_recipe(rec_prophet)

#fitting models to nested time series
nested_modeltime_tbl <- modeltime_nested_fit(
  nested_data = nested_data_tbl,
  wflw_prophet, wflw_deepAR
)
#> Fitting models on training data... ====>-------------------------- 14% | ET...
#> Fitting models on training data... =========>--------------------- 29% | ET...
#> Fitting models on training data... =============>----------------- 43% | ET...
#> Fitting models on training data... =================>------------- 57% | ET...
#> Fitting models on training data... =====================>--------- 71% | ET...
#> Fitting models on training data... ==========================>---- 86% | ET...
#> Warning: Some models had errors during fitting. Use
#> `extract_nested_error_report()` to review errors.

extract_nested_error_report(nested_modeltime_tbl) 
#> # A tibble: 7 x 4
#>   id    .model_id .model_desc .error_desc                                       
#>   <fct>     <int> <chr>       <chr>                                             
#> 1 1_1           2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~
#> 2 1_3           2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~
#> 3 1_8           2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~
#> 4 1_13          2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~
#> 5 1_38          2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~
#> 6 1_93          2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~
#> 7 1_95          2 DEEP_AR     Column not found: id = 'id'. Make sure your datas~

Created on 2021-12-28 by the reprex package (v2.0.1)

Note: I tried adding a dummy "id" column to my nested train/test frames (and to my recipe), but got the same error.
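
In case it helps with diagnosis, here is a minimal sketch for checking which columns a model actually sees during the nested fit. It rebuilds the nested data from the reprex above and inspects the training split of the first series. My assumption, which I have not confirmed, is that nest_timeseries() keeps `id` only in the outer table, so the inner frames passed to each workflow would not contain it; if so, that would be consistent with the "Column not found" error.

```r
library(dplyr)
library(timetk)     # provides walmart_sales_weekly
library(modeltime)  # provides the nested-forecasting helpers

# Same nested data as in the reprex above
nested_data_tbl <- walmart_sales_weekly %>%
  select(id, Date, Weekly_Sales) %>%
  extend_timeseries(.id_var        = id,
                    .date_var      = Date,
                    .length_future = 52) %>%
  nest_timeseries(.id_var        = id,
                  .length_future = 52,
                  .length_actual = 52 * 2) %>%
  split_nested_timeseries(.length_test = 52)

# Columns of the outer (nested) table
names(nested_data_tbl)

# Columns of the training split for the first series (.row_id = 1)
train_tbl <- extract_nested_train_split(nested_data_tbl, .row_id = 1)
names(train_tbl)

# Is the column that deep_ar(id = "id") asks for present in the inner frame?
"id" %in% names(train_tbl)
```

If the last line returns FALSE, the DeepAR spec is looking for a column that the nested frames do not carry, which might also explain why the recipe built from this split cannot supply it.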