tidyverts / fable

Tidy time series forecasting
https://fable.tidyverts.org
GNU General Public License v3.0
564 stars 66 forks source link

Reconciled Arima with exogenous regressors produces unexplainable results #395

Open zac-garland-mc opened 1 year ago

zac-garland-mc commented 1 year ago

Running into a strange occurrence where a reconciled arima model with exogenous regressors produces an unexplained set of results, where all of the reconciled estimates are far above the baseline models (in the absence of the aggregate_key / reconciliation) by a factor of 20% or more.

In the below example, I would expect that the no hierarchy/aggregation models would line up with the arima_base as well as the top_down lining up with the non-aggregated (aggregate). However, they seem to produce much different results. Below I am using a dummy regressor that's intended to be the same value across all series.

I apologize in advance for the verbosity, but currently I am unable to pinpoint exactly where the differences come from.

The only related issue that I could find is https://github.com/tidyverts/fable/issues/373 though it does seem to be slightly different in nature.

library(fpp3)
library(purrr)
library(highcharter)

# base data set

base_data_set <- fpp3::us_employment %>%
  filter(Title %in% c("Private Service-Providing", "Government", "Goods-Producing")) %>%
  index_by(Month) %>%
  mutate(total_employed_jitter = sum(Employed, na.rm = TRUE) %>% jitter(amount = 5000)) %>%
  ungroup()

# no hierarchy / aggregation ---------------------------------------------------------------

# fitted without hierarchy / aggregate_key
fitted_no_hierarchy <- base_data_set %>%
  model(
    arima_no_agg = ARIMA(Employed ~ total_employed_jitter)
  ) %>%
  fitted()

# fitted total without hierarchy / aggregate key
fitted_sum_no_hierarchy <- base_data_set %>%
  index_by(Month) %>%
  summarize(
    total_employed = sum(Employed, na.rm = TRUE),
    total_employed_jitter = mean(total_employed_jitter, na.rm = TRUE)
  ) %>%
  model(
    arima_no_agg = ARIMA(total_employed ~ total_employed_jitter)
  ) %>%
  fitted() %>%
  mutate(Title = "<aggregated>")

joined_no_hierarchy <- fitted_no_hierarchy %>%
  left_join(
    distinct(us_employment,Series_ID,Title)
  ) %>%
  full_join(
    fitted_sum_no_hierarchy
  ) %>%
  as_tibble() %>%
  select(-Series_ID) %>%
  rename(.mean = .fitted)

# with hierarchy / aggregated key -----------------------------------------

# aggregated key
agg_data_set <- base_data_set %>%
  aggregate_key(Title,
                Employed = sum(Employed, na.rm = TRUE),
                total_employed_jitter = mean(total_employed_jitter, na.rm = TRUE)
  )

# fitted reconciled
fitted_recon_arima <- agg_data_set %>%
  model(
    arima_base = ARIMA(Employed ~ total_employed_jitter)
  ) %>%
  reconcile(
    top_down = top_down(arima_base)
  ) %>%
  forecast(agg_data_set)

# joined & plotted  -----------------------------------------------------------------

fitted_recon_arima %>%
  rename(dist_emp = Employed) %>%
  left_join(
    agg_data_set
  ) %>%
  full_join(
    joined_no_hierarchy
  ) %>%
  as_tibble() %>%
  mutate(
    month = as_date(Month),
    Title = as.character(Title)
  ) %>%
  select(Title, .model, month, .mean, actual = Employed) %>%
  mutate(Title = stringr::str_replace(Title,"<aggregated>","aggregated")) %>%
  split(.$Title) %>%
  imap(~ {
    hchart(.x, "line", hcaes(month, .mean, group = .model)) %>%
      hc_add_series(.x %>% distinct(month, actual) %>% mutate(.model = "actual"), "scatter", hcaes(month, actual, group = .model),
                    color = I("#D22A2F1A")) %>%
      hc_title(text = .y)
  }) %>%
  hw_grid()

sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] highcharter_0.9.4.9000 purrr_0.3.4            fable_0.3.1           
#>  [4] feasts_0.2.2           fabletools_0.3.2       tsibbledata_0.4.0     
#>  [7] tsibble_1.1.1          ggplot2_3.3.6          lubridate_1.8.0       
#> [10] tidyr_1.2.0            dplyr_1.0.9            tibble_3.1.7          
#> [13] fpp3_0.5              
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.8.3         lattice_0.20-41      zoo_1.8-10          
#>  [4] assertthat_0.2.1     digest_0.6.29        utf8_1.2.2          
#>  [7] R6_2.5.1             backports_1.4.1      reprex_2.0.1        
#> [10] evaluate_0.15        highr_0.9            pillar_1.7.0        
#> [13] rlang_1.0.2          curl_4.3.2           data.table_1.14.2   
#> [16] rstudioapi_0.13      TTR_0.24.3           R.utils_2.11.0      
#> [19] R.oo_1.25.0          rmarkdown_2.14       styler_1.7.0        
#> [22] stringr_1.4.0        htmlwidgets_1.5.4    igraph_1.3.0        
#> [25] munsell_0.5.0        broom_0.8.0          anytime_0.3.9       
#> [28] compiler_4.0.3       xfun_0.31            pkgconfig_2.0.3     
#> [31] htmltools_0.5.2      tidyselect_1.1.2     fansi_1.0.3         
#> [34] crayon_1.5.1         withr_2.5.0          R.methodsS3_1.8.2   
#> [37] rappdirs_0.3.3       grid_4.0.3           distributional_0.3.0
#> [40] jsonlite_1.8.0       gtable_0.3.0         lifecycle_1.0.1     
#> [43] DBI_1.1.3            magrittr_2.0.3       scales_1.2.0        
#> [46] rlist_0.4.6.2        quantmod_0.4.20      cli_3.3.0           
#> [49] stringi_1.7.6        farver_2.1.0         fs_1.5.2            
#> [52] xts_0.12.1           ellipsis_0.3.2       generics_0.1.2      
#> [55] vctrs_0.4.1          tools_4.0.3          R.cache_0.15.0      
#> [58] glue_1.6.2           fastmap_1.1.0        yaml_2.3.5          
#> [61] colorspace_2.0-3     knitr_1.39
zac-garland-mc commented 1 year ago

One of my colleagues pointed out that another interesting finding is that these results differ when we modify the above to use forecast vs. fitted though we are unsure of why as they are both based on in sample data.

But if we model in log/log then we see similar results between forecast and fitted