Open slava-keshkov opened 3 years ago
This should be possible, but it's not something I've tested. Without your code, I can't see what you've tried and where this error might be coming from. Please provide a minimal reproducible example: https://www.tidyverse.org/help/
My best guess is that you've tried using a model specification with the same regressors for the top level and the other levels of the hierarchy. To 'remove' the regressors where you don't want them, you've provided them in the data as NA. Instead, you should specify a model (with the formula) that does not need exogenous regressors for those series. Currently the best way to do this is to split up your tsibble, produce several mables, and then combine the mables into a complete hierarchy.
Here's a complete example of what I think you want to do. Note that you'll need to install the dev version of fabletools with remotes::install_github("tidyverts/fabletools"), as I found a bug with bind_rows(<mable>, <mable>) in the process:
library(fable)
#> Loading required package: fabletools
library(tsibble)
#>
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, union
# Prepare the data
lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths)) %>%
  aggregate_key(key, value = sum(value))
# Split up the data by aggregation
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
agg_ld <- lung_deaths %>% filter(is_aggregated(key))
btm_ld <- lung_deaths %>% filter(!is_aggregated(key))
# Specify and train models
fit_agg_ld <- agg_ld %>%
  # For some regressor x
  mutate(x = seq_along(value)) %>%
  # Estimate a dynamic regression model
  model(mdl = ARIMA(value ~ x))
fit_btm_ld <- btm_ld %>%
  # All other models are ETS
  model(mdl = ETS(value))
# Combine models into single mable of complete hierarchy
fit <- bind_rows(fit_agg_ld, fit_btm_ld)
fit
#> # A mable: 3 x 2
#> # Key:     key [3]
#>   key          mdl
#>   <chr*>       <model>
#> 1 <aggregated> <LM w/ ARIMA(0,0,1)(1,0,0)[12] errors>
#> 2 fdeaths      <ETS(M,N,M)>
#> 3 mdeaths      <ETS(M,A,A)>
fit <- fit %>%
  # Add MinT reconciliation
  reconcile(mdl = min_trace(mdl))
# Produce forecasts
## Need to specify future values of the regressor.
## This can be NA for models that don't use the regressor.
lung_deaths_future <- new_data(lung_deaths, 24) %>%
  mutate(x = rep(73:96, 3))
## Forecast (with reconciliation) the lung deaths using the trained models
forecast(fit, new_data = lung_deaths_future)
#> # A fable: 72 x 6 [1M]
#> # Key:     key, .model [3]
#>    key          .model    index          value     x .mean
#>    <chr*>       <chr>     <mth>         <dist> <int> <dbl>
#>  1 <aggregated> mdl    1980 Jan N(2664, 31042)    73 2664.
#>  2 <aggregated> mdl    1980 Feb N(2666, 32837)    74 2666.
#>  3 <aggregated> mdl    1980 Mar N(2497, 29099)    75 2497.
#>  4 <aggregated> mdl    1980 Apr N(2030, 20884)    76 2030.
#>  5 <aggregated> mdl    1980 May N(1618, 15179)    77 1618.
#>  6 <aggregated> mdl    1980 Jun N(1461, 13536)    78 1461.
#>  7 <aggregated> mdl    1980 Jul N(1384, 12647)    79 1384.
#>  8 <aggregated> mdl    1980 Aug N(1252, 11325)    80 1252.
#>  9 <aggregated> mdl    1980 Sep N(1246, 11262)    81 1246.
#> 10 <aggregated> mdl    1980 Oct N(1512, 14221)    82 1512.
#> # … with 62 more rows
Created on 2021-10-09 by the reprex package (v2.0.0)
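The future regressor values in the example are hard-coded as rep(73:96, 3), which relies on there being exactly 72 training observations and 3 series. As a rough sketch only (not tested against the dev version mentioned above), the same column could be derived from the data instead. The names n_train and h are made up for this illustration, and the sketch assumes that group_by_key() and mutate() behave on the output of new_data() as they do on an ordinary tsibble:

```r
library(fable)
library(tsibble)
library(dplyr)

# Same hierarchy as in the example above
lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths)) %>%
  aggregate_key(key, value = sum(value))

# Hypothetical helpers: length of the training sample for one series,
# and the forecast horizon
n_train <- nrow(filter(lung_deaths, is_aggregated(key)))
h <- 24

# Continue the trend regressor x past the training sample, per series,
# rather than hard-coding the range and the number of series
lung_deaths_future <- new_data(lung_deaths, h) %>%
  group_by_key() %>%
  mutate(x = n_train + seq_len(n())) %>%
  ungroup()
```

This only applies to a regressor like x that is a simple time index; for a real exogenous variable you would still have to supply its future values yourself.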
Hi @mitchelloharawild, thanks for your solution, it's working.
On this topic, I am also wondering: is it possible to add an exogenous variable for one of the many hierarchical series after the full mable has already been trained and reconciled?
For example: we fit 100 ARIMA models on multiple different levels. We train them and do the reconciliation. We save the trained mable to an .RDS file.
Then we want to re-train only one of the series with an added exogenous regressor. Can we then "pull out" one model, re-train it, and apply reconciliation without needing to train the rest of the models again? Or is it virtually impossible?
Let me know if the question is clear enough or if I should provide a better example.
Looking forward to your replies! 🙌
Hello,
I was adding exogenous regressors to a forecasting hierarchy using an ARIMA model and formula notation. It works well when the exogenous values are added to all levels of the hierarchy.
However, I would like to try applying them to only one level of the hierarchy.
I tried replacing the exogenous values with NA on the levels where I do NOT need them. This causes all models to become "NULL model" after fitting with model(). During fitting, I also get warning messages saying that the exogenous variables have been dropped because the matrix was rank deficient.
I cannot drop the exogenous values completely, since they are columns in the data frame that holds the forecasting groups.
Would love to hear your feedback on this. It seems like essential functionality to me.