tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/

Automatically drop rank deficient regressors with warning #404

Open robjhyndman opened 1 month ago

robjhyndman commented 1 month ago

Based on https://stackoverflow.com/q/78388045/144157

library(fable)
library(tsibble)
library(dplyr)

set.seed(123)

# create data
aa <- tibble(idx = 1:12, group = "aa", y = rnorm(12))
bb <- tibble(idx = 7:12, group = "bb", y = rnorm(6))
cc <- tibble(idx = 1:6, group = "cc", y = rnorm(6))
xreg <- tibble(idx = 1:12, x1 = c(0, 0, 1, rep(0, 9)), x2 = c(rep(0, 9), 1, 0, 0))
dat <- bind_rows(aa, bb, cc) |>
  inner_join(xreg, by = "idx") |>
  as_tsibble(index = idx, key = group)

dat |>
  model(arima = ARIMA(y ~ 1 + x1))
#> Warning: Provided exogenous regressors are rank deficient, removing regressors:
#> `x1`
#> Warning: 1 error encountered for arima
#> [1] subscript out of bounds
#> # A mable: 3 x 2
#> # Key:     group [3]
#>   group                       arima
#>   <chr>                     <model>
#> 1 aa    <LM w/ ARIMA(0,0,0) errors>
#> 2 bb                   <NULL model>
#> 3 cc    <LM w/ ARIMA(0,0,0) errors>

Created on 2024-05-12 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24)
#>  os       KDE neon 6.0
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2024-05-12
#>  pandoc   3.1.11.1 @ /usr/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package        * version    date (UTC) lib source
#>  anytime          0.3.9      2020-08-27 [1] RSPM
#>  cli              3.6.2      2023-12-11 [1] RSPM
#>  colorspace       2.1-0      2023-01-23 [1] RSPM
#>  digest           0.6.35     2024-03-11 [1] RSPM
#>  distributional   0.4.0      2024-02-07 [1] RSPM (R 4.3.0)
#>  dplyr          * 1.1.4      2023-11-17 [1] RSPM
#>  ellipsis         0.3.2      2021-04-29 [1] RSPM
#>  evaluate         0.23       2023-11-01 [1] RSPM
#>  fable          * 0.3.4.9000 2024-04-28 [1] Github (tidyverts/fable@39665f2)
#>  fabletools     * 0.4.2      2024-04-22 [1] RSPM (R 4.4.0)
#>  fansi            1.0.6      2023-12-08 [1] RSPM
#>  fastmap          1.1.1      2023-02-24 [1] RSPM
#>  feasts           0.3.2.9000 2024-04-28 [1] Github (tidyverts/feasts@6e9266e)
#>  fs               1.6.4      2024-04-25 [1] RSPM (R 4.4.0)
#>  generics         0.1.3      2022-07-05 [1] RSPM
#>  ggplot2          3.5.1      2024-04-23 [1] RSPM (R 4.3.3)
#>  glue             1.7.0      2024-01-09 [1] RSPM
#>  gtable           0.3.5      2024-04-22 [1] RSPM (R 4.3.3)
#>  htmltools        0.5.8.1    2024-04-04 [1] RSPM (R 4.3.3)
#>  knitr            1.46       2024-04-06 [1] RSPM
#>  lattice          0.22-6     2024-03-20 [1] RSPM
#>  lifecycle        1.0.4      2023-11-07 [1] RSPM
#>  lubridate        1.9.3      2023-09-27 [1] RSPM
#>  magrittr         2.0.3      2022-03-30 [1] RSPM
#>  munsell          0.5.1      2024-04-01 [1] RSPM
#>  nlme             3.1-164    2023-11-27 [1] RSPM
#>  pillar           1.9.0      2023-03-22 [1] RSPM
#>  pkgconfig        2.0.3      2019-09-22 [1] RSPM
#>  progressr        0.14.0     2023-08-10 [1] RSPM
#>  purrr            1.0.2      2023-08-10 [1] RSPM
#>  R.cache          0.16.0     2022-07-21 [1] RSPM (R 4.3.3)
#>  R.methodsS3      1.8.2      2022-06-13 [1] RSPM (R 4.3.3)
#>  R.oo             1.26.0     2024-01-24 [1] RSPM (R 4.3.3)
#>  R.utils          2.12.3     2023-11-18 [1] RSPM (R 4.3.3)
#>  R6               2.5.1      2021-08-19 [1] RSPM
#>  Rcpp             1.0.12     2024-01-09 [1] RSPM
#>  reprex           2.1.0      2024-01-11 [1] RSPM
#>  rlang            1.1.3      2024-01-10 [1] RSPM
#>  rmarkdown        2.26.2     2024-05-01 [1] Github (rstudio/rmarkdown@3d99b8e)
#>  rstudioapi       0.16.0     2024-03-24 [1] RSPM
#>  scales           1.3.0      2023-11-28 [1] RSPM
#>  sessioninfo      1.2.2      2021-12-06 [1] RSPM
#>  styler           1.10.3     2024-04-07 [1] RSPM (R 4.3.3)
#>  tibble           3.2.1      2023-03-20 [1] RSPM
#>  tidyr            1.3.1      2024-01-24 [1] RSPM
#>  tidyselect       1.2.1      2024-03-11 [1] RSPM
#>  timechange       0.3.0      2024-01-18 [1] RSPM
#>  tsibble        * 1.1.4      2024-04-28 [1] Github (tidyverts/tsibble@9e0057e)
#>  urca             1.3-3      2022-08-29 [1] RSPM
#>  utf8             1.2.4      2023-10-22 [1] RSPM
#>  vctrs            0.6.5      2023-12-01 [1] RSPM
#>  withr            3.0.0      2024-01-16 [1] RSPM
#>  xfun             0.43       2024-03-25 [1] RSPM (R 4.3.3)
#>  yaml             2.3.8      2023-12-11 [1] RSPM
#>
#>  [1] /usr/local/lib/R/site-library
#>  [2] /usr/lib/R/site-library
#>  [3] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

mitchelloharawild commented 1 month ago

Thanks, I think it should behave like lm() and warn that the rank deficient regressors have been removed.

robjhyndman commented 1 month ago

It already gives a warning. The point is that the model with all regressors removed should still be fitted, not return a NULL.

mitchelloharawild commented 1 month ago

Currently it errors (and returns a NULL model) if any of the regressors are rank deficient. In case there is a subset of regressors with full rank, it should use those to estimate the model with a warning.
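
For illustration only, here is a minimal base R sketch of the behaviour requested in this thread: keep the full-rank subset of regressors, warn about the ones removed, and still return a usable (possibly empty) regressor matrix. The helper name `drop_rank_deficient()` and its `intercept` argument are hypothetical and are not part of fable or fabletools; the sketch assumes the regressors arrive as a numeric matrix with column names, and it relies on base R's default `qr()`, which moves columns it judges collinear to the end of the pivot.

``` r
# Hypothetical helper, not part of fable or fabletools.
# Keeps a set of linearly independent regressor columns and warns about the rest.
drop_rank_deficient <- function(xreg, intercept = TRUE) {
  xreg <- as.matrix(xreg)
  # Put the intercept first so that constant regressors are flagged instead of it.
  design <- if (intercept) cbind("(Intercept)" = 1, xreg) else xreg
  qr_design <- qr(design)
  # The first `rank` entries of the pivot index the columns that were retained.
  keep <- sort(qr_design$pivot[seq_len(qr_design$rank)])
  kept_names <- colnames(design)[keep]
  dropped <- setdiff(colnames(design), kept_names)
  if (length(dropped) > 0) {
    warning(
      "Provided exogenous regressors are rank deficient, removing regressors: ",
      paste0("`", dropped, "`", collapse = ", "),
      call. = FALSE
    )
  }
  # May have zero columns, in which case only the intercept would remain in the model.
  xreg[, setdiff(kept_names, "(Intercept)"), drop = FALSE]
}

# Regressors for group "bb" in the reprex above (idx 7:12): x1 is all zero there.
bb_xreg <- cbind(x1 = rep(0, 6), x2 = c(0, 0, 0, 1, 0, 0))
kept <- drop_rank_deficient(bb_xreg)                            # warns about `x1`
none <- drop_rank_deficient(bb_xreg[, "x1", drop = FALSE])      # warns, 0 columns left
```

With `x1` as the only regressor, as in the reprex, the result has zero columns, so the fit would reduce to an intercept-only model, which is the case this issue asks to handle instead of returning a NULL model.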