tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/

Unpack a hilo column #337

Closed (dkent287 closed 2 years ago)

dkent287 commented 2 years ago

I am working through Section 5.5 of Forecasting: Principles and Practice (3rd ed.), and I am having significant difficulty isolating the prediction interval values in a vector (or a similar object I can work with).

I am aware of the unpack_hilo function, but I am having difficulty getting it to work.

Would it be possible to add an example to the unpack_hilo function documentation?

The example could reference the following:

google_2015 %>%
  model(NAIVE(Close)) %>%
  forecast(h = 10) %>%
  hilo()

I should be able to, for example, isolate the lower values for the 80% prediction interval in a single vector.

mitchelloharawild commented 2 years ago

You can use unpack_hilo() on a dataset containing <hilo> columns like so:

library(fable)
#> Loading required package: fabletools
as_tsibble(USAccDeaths) %>%
  model(NAIVE(value)) %>%
  forecast(h = 10) %>%
  hilo() %>% 
  unpack_hilo("80%")
#> # A tsibble: 10 x 7 [1M]
#> # Key:       .model [1]
#>    .model          index            value .mean `80%_lower` `80%_upper`
#>    <chr>           <mth>           <dist> <dbl>       <dbl>       <dbl>
#>  1 NAIVE(value) 1979 Jan  N(9240, 533130)  9240       8304.      10176.
#>  2 NAIVE(value) 1979 Feb N(9240, 1066260)  9240       7917.      10563.
#>  3 NAIVE(value) 1979 Mar N(9240, 1599390)  9240       7619.      10861.
#>  4 NAIVE(value) 1979 Apr N(9240, 2132520)  9240       7369.      11111.
#>  5 NAIVE(value) 1979 May N(9240, 2665650)  9240       7148.      11332.
#>  6 NAIVE(value) 1979 Jun N(9240, 3198779)  9240       6948.      11532.
#>  7 NAIVE(value) 1979 Jul N(9240, 3731909)  9240       6764.      11716.
#>  8 NAIVE(value) 1979 Aug N(9240, 4265039)  9240       6593.      11887.
#>  9 NAIVE(value) 1979 Sep N(9240, 4798169)  9240       6433.      12047.
#> 10 NAIVE(value) 1979 Oct N(9240, 5331299)  9240       6281.      12199.
#> # … with 1 more variable: 95% <hilo>

Created on 2021-12-09 by the reprex package (v2.0.0)

Note how the 80% column is now unpacked into the 80%_lower and 80%_upper columns.


A better alternative is to access the lower and upper bounds directly with <hilo>$lower and <hilo>$upper. You can also get the confidence level with <hilo>$level.

library(fable)
#> Loading required package: fabletools
as_tsibble(USAccDeaths) %>%
  model(NAIVE(value)) %>%
  forecast(h = 10) %>%
  hilo() %>% 
  dplyr::mutate(lower = `80%`$lower)
#> # A tsibble: 10 x 7 [1M]
#> # Key:       .model [1]
#>    .model          index            value .mean                  `80%`
#>    <chr>           <mth>           <dist> <dbl>                 <hilo>
#>  1 NAIVE(value) 1979 Jan  N(9240, 533130)  9240 [8304.266, 10175.73]80
#>  2 NAIVE(value) 1979 Feb N(9240, 1066260)  9240 [7916.672, 10563.33]80
#>  3 NAIVE(value) 1979 Mar N(9240, 1599390)  9240 [7619.260, 10860.74]80
#>  4 NAIVE(value) 1979 Apr N(9240, 2132520)  9240 [7368.531, 11111.47]80
#>  5 NAIVE(value) 1979 May N(9240, 2665650)  9240 [7147.634, 11332.37]80
#>  6 NAIVE(value) 1979 Jun N(9240, 3198779)  9240 [6947.928, 11532.07]80
#>  7 NAIVE(value) 1979 Jul N(9240, 3731909)  9240 [6764.279, 11715.72]80
#>  8 NAIVE(value) 1979 Aug N(9240, 4265039)  9240 [6593.343, 11886.66]80
#>  9 NAIVE(value) 1979 Sep N(9240, 4798169)  9240 [6432.797, 12047.20]80
#> 10 NAIVE(value) 1979 Oct N(9240, 5331299)  9240 [6280.948, 12199.05]80
#> # … with 2 more variables: 95% <hilo>, lower <dbl>

Created on 2021-12-09 by the reprex package (v2.0.0)
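
To get the bounds as a plain vector outside the table, as the original question asked, the same $lower accessor can be applied to the <hilo> column once it has been extracted. This is a minimal sketch that reuses the USAccDeaths example above; the names fc and lower_80 are only illustrative, and it assumes hilo() keeps its default 80% and 95% levels.

```r
library(fable)

# Same forecast as above, stored so the <hilo> columns can be reused.
fc <- as_tsibble(USAccDeaths) %>%
  model(NAIVE(value)) %>%
  forecast(h = 10) %>%
  hilo()

# Extract the `80%` <hilo> column, then take its lower bounds.
# The result should be an ordinary numeric vector, one value per horizon.
lower_80 <- fc$`80%`$lower
lower_80
```

The upper bounds can be pulled out the same way with fc$`80%`$upper.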

dkent287 commented 2 years ago

That works - thanks.