tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/

Out of bounds using middle_out for more than 2 hierarchy levels #332

Open StefJen opened 2 years ago

StefJen commented 2 years ago

I'm trying to produce a middle-out forecast for a hierarchy with several levels. However, as soon as the hierarchy has more than 2 levels, I run into this error:

Error: Problem with `mutate()` column `mo_ets`.
i `mo_ets = (function (object, ...) ...`.
x subscript out of bounds

The code below works without any issues:

library(fpp3)
tourism <- tsibble::tourism %>%
  mutate(State = recode(State,
                        `New South Wales` = "NSW",
                        `Northern Territory` = "NT",
                        `Queensland` = "QLD",
                        `South Australia` = "SA",
                        `Tasmania` = "TAS",
                        `Victoria` = "VIC",
                        `Western Australia` = "WA"
  )) %>% 
  aggregate_key(State / Region, Trips = sum(Trips)) %>% 
  model(ets=ETS(Trips)) %>%
  reconcile(mo_ets = middle_out(ets),
            method = "forecast_proportions",
            level=State) %>% 
  forecast(h=3)

But after adding another level to the hierarchy, so that it has three levels instead of two, I get the error:

library(fpp3)
tourism <- tsibble::tourism %>%
  mutate(State = recode(State,
                        `New South Wales` = "NSW",
                        `Northern Territory` = "NT",
                        `Queensland` = "QLD",
                        `South Australia` = "SA",
                        `Tasmania` = "TAS",
                        `Victoria` = "VIC",
                        `Western Australia` = "WA"
  )) %>% 
  mutate(NewCol = paste(State,"ABC",sep="")) %>%
  aggregate_key(NewCol / State / Region, Trips = sum(Trips)) %>% 
  model(ets=ETS(Trips)) %>%
  reconcile(mo_ets = middle_out(ets),
            method = "forecast_proportions",
            level=State) %>% 
  forecast(h=3)

Reverting to just two levels would in practice mean using a top-down or bottom-up approach, and I'm looking to achieve a true middle-out forecast in a setting with multiple levels. For my specific use case I need 4-7 levels.
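
To make the goal concrete, here is a minimal base-R sketch of what a middle-out step with forecast proportions is meant to produce for a single time step. All numbers and the region-to-state mapping are made up for illustration, and this is not how fabletools implements `middle_out()`; it only shows the arithmetic: levels above the middle are sums of the middle-level forecasts, and levels below are the middle-level forecasts split in proportion to the children's own base forecasts.

```r
# Toy base forecasts for one time step (all values are hypothetical).
base_state  <- c(NSW = 100, VIC = 80)                 # middle level
base_region <- c(Sydney = 70, Hunter = 40,            # bottom level
                 Melbourne = 50, Gippsland = 25)
parent <- c(Sydney = "NSW", Hunter = "NSW",
            Melbourne = "VIC", Gippsland = "VIC")

# Upward: every level above the middle is a sum of middle-level forecasts.
total <- sum(base_state)

# Downward: split each state's forecast across its regions in proportion
# to the regions' own base forecasts.
share  <- base_region / ave(base_region, parent, FUN = sum)
region <- base_state[parent] * share
names(region) <- names(base_region)

# The reconciled regions add back up to their state, and the states to the total.
region_by_state <- tapply(region, parent, sum)
```

With more levels above or below the middle, the same two steps would be repeated level by level, which is where the 4-7 level case comes in.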

My session info for reference:

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=Danish_Denmark.1252  LC_CTYPE=Danish_Denmark.1252    LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C                    LC_TIME=Danish_Denmark.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] readr_2.0.2       fable_0.3.1       feasts_0.2.2      fabletools_0.3.1  tsibbledata_0.3.0 tsibble_1.1.0     ggplot2_3.3.5     lubridate_1.8.0   tidyr_1.1.4      
[10] dplyr_1.0.7       tibble_3.1.5      fpp3_0.4.0       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7           urca_1.3-0           progressr_0.9.0      pillar_1.6.4         compiler_4.1.0       tools_4.1.0          digest_0.6.28       
 [8] bit_4.0.4            lattice_0.20-44      nlme_3.1-152         lifecycle_1.0.1      gtable_0.3.0         anytime_0.3.9        pkgconfig_2.0.3     
[15] rlang_0.4.12         DBI_1.1.1            cli_3.1.0            rstudioapi_0.13      parallel_4.1.0       withr_2.4.2          hms_1.1.1           
[22] generics_0.1.1       vctrs_0.3.8          bit64_4.0.5          grid_4.1.0           tidyselect_1.1.1     glue_1.4.2           R6_2.5.1            
[29] fansi_0.5.0          distributional_0.2.2 vroom_1.5.5          tzdb_0.2.0           purrr_0.3.4          farver_2.1.0         magrittr_2.0.1      
[36] scales_1.1.1         ellipsis_0.3.2       assertthat_0.2.1     colorspace_2.0-2     labeling_0.4.2       utf8_1.2.2           munsell_0.5.0       
[43] crayon_1.4.2  

Also posted on Stack Overflow.

mitchelloharawild commented 2 years ago

Thanks for the issue, I have seen your stackoverflow post but this is a more appropriate place for it. I believe this is a bug, and will work on fixing it as part of a reconciliation rework.

StefJen commented 2 years ago

I'll make sure to post here next time I discover something. I'm new to this GitHub universe 👍

haythamomar commented 2 years ago

Same problem: even top-down does not work with more than two levels. I get this error with the code below:

tourism <- tsibble::tourism %>%
  mutate(State = recode(State,
                        `New South Wales` = "NSW",
                        `Northern Territory` = "NT",
                        `Queensland` = "QLD",
                        `South Australia` = "SA",
                        `Tasmania` = "TAS",
                        `Victoria` = "VIC",
                        `Western Australia` = "WA"
  )) %>% 
  mutate(NewCol = paste(State,"ABC",sep="")) %>%
  aggregate_key(NewCol / State / Region, Trips = sum(Trips)) %>% 
  model(ets=ETS(Trips)) %>%
  reconcile(mo_ets = top_down(ets),
            method = "forecast_proportions",
            level=State) %>% 
  forecast(h=3)

Error in `mutate_cols()`:
! Problem with `mutate()` column `mo_ets`.
ℹ `mo_ets = (function (object, ...) ...`.
x subscript out of bounds
Caused by error in `t(rowsum(t(fc_mean[, agg_child_loc, drop = FALSE]), agg_parent))[, agg_parent]`:
! subscript out of bounds
Run `rlang::last_error()` to see where the error occurred.
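
For anyone trying to read that traceback, here is a base-R sketch of one way an expression of this shape can raise "subscript out of bounds". The values of `fc_mean`, `agg_child_loc` and `agg_parent` below are hypothetical stand-ins, and whether this is what actually happens inside fabletools is an assumption, not something confirmed in this thread. The point is only that `rowsum()` returns one row per unique group, so indexing the result by the original group values as positions fails whenever those values exceed the number of groups.

```r
# Hypothetical stand-ins for the objects named in the traceback.
fc_mean <- matrix(1:8, nrow = 2)      # 2 forecast horizons x 4 child series
agg_child_loc <- 1:4                  # columns holding the child series
agg_parent <- c(2L, 2L, 5L, 5L)       # assumed: each child's parent position

# rowsum() collapses the children into one row per unique parent,
# so after t() there are only two columns, named after the parents.
summed <- t(rowsum(t(fc_mean[, agg_child_loc, drop = FALSE]), agg_parent))

# Indexing by the integer parent values treats them as column positions;
# position 5 does not exist in a two-column matrix.
res <- tryCatch(summed[, agg_parent], error = function(e) conditionMessage(e))

# Indexing by name instead picks the parent column for each child.
by_name <- summed[, as.character(agg_parent)]
```

In this toy case the positional lookup only works when the parent values happen to be small enough to be valid column positions, which would be consistent with a shallow hierarchy working and a deeper one failing, but that link is speculation.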

albersonmiranda commented 6 months ago

+1

Is there any workaround to make top_down work with 4 levels? It's a must-have in any strict hierarchical benchmarking.