tidyverts / fabletools

General fable features useful for extension packages
http://fabletools.tidyverts.org/

How to change period in `features(var, feat_stl)` #371

Open Aariq opened 1 year ago

Aariq commented 1 year ago

I can't figure out how to pass arguments to feat_stl() when it is used inside features(). I expected an anonymous function to work, but it doesn't.

Reprex:

library(fabletools)
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, union
library(feasts)

#works
tourism |> 
  model(STL(Trips ~ season("1 year")))
#> # A mable: 304 x 4
#> # Key:     Region, State, Purpose [304]
#>    Region         State              Purpose  `STL(Trips ~ season("1 year"))`
#>    <chr>          <chr>              <chr>                            <model>
#>  1 Adelaide       South Australia    Business                           <STL>
#>  2 Adelaide       South Australia    Holiday                            <STL>
#>  3 Adelaide       South Australia    Other                              <STL>
#>  4 Adelaide       South Australia    Visiting                           <STL>
#>  5 Adelaide Hills South Australia    Business                           <STL>
#>  6 Adelaide Hills South Australia    Holiday                            <STL>
#>  7 Adelaide Hills South Australia    Other                              <STL>
#>  8 Adelaide Hills South Australia    Visiting                           <STL>
#>  9 Alice Springs  Northern Territory Business                           <STL>
#> 10 Alice Springs  Northern Territory Holiday                            <STL>
#> # … with 294 more rows

#works
tourism |> 
  features(Trips, feat_stl)
#> # A tibble: 304 × 12
#>    Region  State Purpose trend…¹ seaso…² seaso…³ seaso…⁴ spiki…⁵ linea…⁶ curva…⁷
#>    <chr>   <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Adelai… Sout… Busine…   0.464   0.407       3       1 1.58e+2  -5.31   71.6  
#>  2 Adelai… Sout… Holiday   0.554   0.619       1       2 9.17e+0  49.0    78.7  
#>  3 Adelai… Sout… Other     0.746   0.202       2       1 2.10e+0  95.1    43.4  
#>  4 Adelai… Sout… Visiti…   0.435   0.452       1       3 5.61e+1  34.6    71.4  
#>  5 Adelai… Sout… Busine…   0.464   0.179       3       0 1.03e-1   0.968  -3.22 
#>  6 Adelai… Sout… Holiday   0.528   0.296       2       1 1.77e-1  10.5    24.0  
#>  7 Adelai… Sout… Other     0.593   0.404       2       2 4.44e-4   4.28    3.19 
#>  8 Adelai… Sout… Visiti…   0.488   0.254       0       3 6.50e+0  34.2    -0.529
#>  9 Alice … Nort… Busine…   0.534   0.251       0       1 1.69e-1  23.8    19.5  
#> 10 Alice … Nort… Holiday   0.381   0.832       3       1 7.39e-1 -19.6    10.5  
#> # … with 294 more rows, 2 more variables: stl_e_acf1 <dbl>, stl_e_acf10 <dbl>,
#> #   and abbreviated variable names ¹​trend_strength, ²​seasonal_strength_year,
#> #   ³​seasonal_peak_year, ⁴​seasonal_trough_year, ⁵​spikiness, ⁶​linearity,
#> #   ⁷​curvature

#doesn't work
tourism |> 
  features(Trips, ~feat_stl(., .period = "1 year"))
#> Error in `squash()`:
#> ! Only lists can be spliced

#> Backtrace:
#>     ▆
#>  1. ├─fabletools::features(tourism, Trips, ~feat_stl(., .period = "1 year"))
#>  2. ├─fabletools:::features.tbl_ts(tourism, Trips, ~feat_stl(., .period = "1 year"))
#>  3. │ └─fabletools:::features_impl(.tbl, list(.var), features, ...)
#>  4. │   ├─fabletools:::map(squash(features), as_function)
#>  5. │   │ └─base::lapply(.x, .f, ...)
#>  6. │   └─rlang::squash(features)
#>  7. └─rlang::abort(message = message)

Created on 2022-10-25 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       macOS Big Sur/Monterey 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2022-10-25
#>  pandoc   2.19.2 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package        * version date (UTC) lib source
#>  P anytime          0.3.9   2020-08-27 [?] CRAN (R 4.2.0)
#>  P assertthat       0.2.1   2019-03-21 [?] CRAN (R 4.2.0)
#>  P cli              3.4.1   2022-09-23 [?] CRAN (R 4.2.0)
#>  P colorspace       2.0-3   2022-02-21 [?] CRAN (R 4.2.0)
#>  P DBI              1.1.3   2022-06-18 [?] CRAN (R 4.2.0)
#>  P digest           0.6.29  2021-12-01 [?] CRAN (R 4.2.0)
#>  P distributional   0.3.1   2022-09-02 [?] CRAN (R 4.2.0)
#>  P dplyr            1.0.10  2022-09-01 [?] CRAN (R 4.2.0)
#>  P ellipsis         0.3.2   2021-04-29 [?] CRAN (R 4.2.0)
#>  P evaluate         0.17    2022-10-07 [?] CRAN (R 4.2.0)
#>  P fabletools     * 0.3.2   2021-11-29 [?] CRAN (R 4.2.0)
#>  P fansi            1.0.3   2022-03-24 [?] CRAN (R 4.2.0)
#>  P farver           2.1.1   2022-07-06 [?] CRAN (R 4.2.0)
#>  P fastmap          1.1.0   2021-01-25 [?] CRAN (R 4.2.0)
#>  P feasts         * 0.3.0   2022-09-01 [?] CRAN (R 4.2.0)
#>  P fs               1.5.2   2021-12-08 [?] CRAN (R 4.2.0)
#>  P generics         0.1.3   2022-07-05 [?] CRAN (R 4.2.0)
#>  P ggplot2          3.3.6   2022-05-03 [?] CRAN (R 4.2.0)
#>  P glue             1.6.2   2022-02-24 [?] CRAN (R 4.2.0)
#>  P gtable           0.3.1   2022-09-01 [?] CRAN (R 4.2.0)
#>  P highr            0.9     2021-04-16 [?] CRAN (R 4.2.0)
#>  P htmltools        0.5.3   2022-07-18 [?] CRAN (R 4.2.0)
#>  P knitr            1.40    2022-08-24 [?] CRAN (R 4.2.0)
#>  P lifecycle        1.0.3   2022-10-07 [?] CRAN (R 4.2.0)
#>  P lubridate        1.8.0   2021-10-07 [?] CRAN (R 4.2.0)
#>  P magrittr         2.0.3   2022-03-30 [?] CRAN (R 4.2.0)
#>  P munsell          0.5.0   2018-06-12 [?] CRAN (R 4.2.0)
#>  P pillar           1.8.1   2022-08-19 [?] CRAN (R 4.2.0)
#>  P pkgconfig        2.0.3   2019-09-22 [?] CRAN (R 4.2.0)
#>  P progressr        0.11.0  2022-09-02 [?] CRAN (R 4.2.0)
#>  P purrr            0.3.5   2022-10-06 [?] CRAN (R 4.2.0)
#>    R.cache          0.16.0  2022-07-21 [3] CRAN (R 4.2.0)
#>    R.methodsS3      1.8.2   2022-06-13 [3] CRAN (R 4.2.0)
#>    R.oo             1.25.0  2022-06-12 [3] CRAN (R 4.2.0)
#>    R.utils          2.12.0  2022-06-28 [3] CRAN (R 4.2.0)
#>  P R6               2.5.1   2021-08-19 [?] CRAN (R 4.2.0)
#>  P Rcpp             1.0.9   2022-07-08 [?] CRAN (R 4.2.0)
#>  P reprex           2.0.2   2022-08-17 [?] CRAN (R 4.2.0)
#>  P rlang            1.0.6   2022-09-24 [?] CRAN (R 4.2.0)
#>  P rmarkdown        2.17    2022-10-07 [?] CRAN (R 4.2.0)
#>  P rstudioapi       0.14    2022-08-22 [?] CRAN (R 4.2.0)
#>  P scales           1.2.1   2022-08-20 [?] CRAN (R 4.2.0)
#>    sessioninfo      1.2.2   2021-12-06 [3] CRAN (R 4.2.0)
#>  P stringi          1.7.8   2022-07-11 [?] CRAN (R 4.2.0)
#>  P stringr          1.4.1   2022-08-20 [?] CRAN (R 4.2.0)
#>    styler           1.7.0   2022-03-13 [3] CRAN (R 4.2.0)
#>  P tibble           3.1.8   2022-07-22 [?] CRAN (R 4.2.0)
#>  P tidyr            1.2.1   2022-09-08 [?] CRAN (R 4.2.0)
#>  P tidyselect       1.2.0   2022-10-10 [?] CRAN (R 4.2.0)
#>  P tsibble        * 1.1.3   2022-10-09 [?] CRAN (R 4.2.0)
#>  P utf8             1.2.2   2021-07-24 [?] CRAN (R 4.2.0)
#>  P vctrs            0.4.2   2022-09-29 [?] CRAN (R 4.2.0)
#>  P withr            2.5.0   2022-03-03 [?] CRAN (R 4.2.0)
#>  P xfun             0.34    2022-10-18 [?] CRAN (R 4.2.0)
#>  P yaml             2.3.6   2022-10-18 [?] CRAN (R 4.2.0)
#>
#>  [1] /Users/ericscott/Documents/GitHub/azmet-qaqc/renv/library/R-4.2/x86_64-apple-darwin17.0
#>  [2] /Users/ericscott/Documents/GitHub/azmet-qaqc/renv/sandbox/R-4.2/x86_64-apple-darwin17.0/84ba8b13
#>  [3] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#>  P ── Loaded and on-disk path mismatch.
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

mitchelloharawild commented 1 year ago

Thanks for raising this, I think this should be made to work better.

Essentially, features() expects a function or a list of functions. ~feat_stl(., .period = "1 year") is a formula representing a lambda function, and the current method doesn't (yet) know how to handle a bare formula that is not wrapped in a list.
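
To make the distinction concrete, here is a minimal base-R sketch of the kind of normalisation involved: a formula is not a function, so it has to be converted before it can be called. The helpers as_feature_fn() and normalise_features() are hypothetical names used only for illustration; this is not the fabletools implementation.

```r
# Hypothetical helper: turn a one-sided formula such as ~ mean(.) into a
# function of `.`; plain functions pass through unchanged.
as_feature_fn <- function(f) {
  if (is.function(f)) {
    return(f)
  }
  if (inherits(f, "formula") && length(f) == 2L) {
    rhs <- f[[2L]]                     # expression to the right of `~`
    fn <- function(., ...) NULL
    body(fn) <- rhs
    environment(fn) <- environment(f)  # evaluate where the formula was written
    return(fn)
  }
  stop("Each feature must be a function or a one-sided formula.")
}

# Hypothetical helper: accept a single feature or a list of features and
# always return a list of functions.
normalise_features <- function(features) {
  if (!is.list(features)) {
    features <- list(features)
  }
  lapply(features, as_feature_fn)
}

x <- c(2, 4, 9)

# A bare formula and a mixed list are both reduced to a list of functions.
bare <- normalise_features(~ max(.) - min(.))
mixed <- normalise_features(list(mean, ~ max(.) - min(.)))

bare[[1]](x)
vapply(mixed, function(f) f(x), numeric(1))
```

With this kind of normalisation, a bare lambda and a list of lambdas end up on the same code path, which is the behaviour the list() workaround relies on.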

Instead, you can wrap it in a list, which is the recommended approach for providing one or more features.

tourism |> 
  features(Trips, list(~feat_stl(., .period = "1 year")))

However, this runs into a secondary issue: .period is not (yet) set up to handle the "1 year" syntax.

So ultimately, what works is:

tourism |> 
  features(Trips, list(~feat_stl(., .period = 4)))
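
The value 4 follows from the data: tourism is quarterly, so a one-year season spans four observations. As a small worked sketch of that arithmetic (period_in_obs() is a hypothetical helper, and it assumes both lengths are expressed in months):

```r
# Hypothetical helper: number of observations in one seasonal cycle, given
# the season length and the sampling interval, both in months.
period_in_obs <- function(season_months, interval_months) {
  stopifnot(
    interval_months > 0,
    season_months %% interval_months == 0
  )
  season_months / interval_months
}

quarterly_period <- period_in_obs(12, 3)  # yearly season, quarterly data: 4
monthly_period <- period_in_obs(12, 1)    # yearly season, monthly data: 12
```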

9a1e6fd90d37a9aa5be206e1f18218853af5d445 now supports lambda functions being used directly.

f8a8baab1e40f2978958fd02c35f42a43abc0cd2 allows you to use .period = "1 year" in features(), but not directly in feature functions such as feat_stl(), as you are doing. You can change the .period that is passed to all features by using the ... of features().
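
Because the ... of features() applies to every feature, the same .period can be shared across several feature functions in one call. The sketch below assumes a fabletools version that includes the two commits above, and it assumes that feat_acf() from feasts picks up .period in the same way as feat_stl(); that pairing is an illustration and is not taken from this thread.

```r
library(fabletools)
library(tsibble)
library(feasts)

# Assumption: requires a fabletools build containing the commits referenced
# above, so that "1 year" is accepted through the `...` of features().
stl_and_acf <- tourism |>
  features(Trips, list(feat_stl, feat_acf), .period = "1 year")

# One row per key combination (Region, State, Purpose).
nrow(stl_and_acf)
```

If only one feature should use a non-default period, the lambda form with a numeric .period keeps that setting local to that feature.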

So I would recommend (with these latest changes) for your example:

library(fpp3)
tourism |> 
  features(Trips, feat_stl, .period = "1 year")
#> # A tibble: 304 × 12
#>    Region  State Purpose trend…¹ seaso…² seaso…³ seaso…⁴ spiki…⁵ linea…⁶ curva…⁷
#>    <chr>   <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Adelai… Sout… Busine…   0.464   0.407       3       1 1.58e+2  -5.31   71.6  
#>  2 Adelai… Sout… Holiday   0.554   0.619       1       2 9.17e+0  49.0    78.7  
#>  3 Adelai… Sout… Other     0.746   0.202       2       1 2.10e+0  95.1    43.4  
#>  4 Adelai… Sout… Visiti…   0.435   0.452       1       3 5.61e+1  34.6    71.4  
#>  5 Adelai… Sout… Busine…   0.464   0.179       3       0 1.03e-1   0.968  -3.22 
#>  6 Adelai… Sout… Holiday   0.528   0.296       2       1 1.77e-1  10.5    24.0  
#>  7 Adelai… Sout… Other     0.593   0.404       2       2 4.44e-4   4.28    3.19 
#>  8 Adelai… Sout… Visiti…   0.488   0.254       0       3 6.50e+0  34.2    -0.529
#>  9 Alice … Nort… Busine…   0.534   0.251       0       1 1.69e-1  23.8    19.5  
#> 10 Alice … Nort… Holiday   0.381   0.832       3       1 7.39e-1 -19.6    10.5  
#> # … with 294 more rows, 2 more variables: stl_e_acf1 <dbl>, stl_e_acf10 <dbl>,
#> #   and abbreviated variable names ¹​trend_strength, ²​seasonal_strength_4,
#> #   ³​seasonal_peak_4, ⁴​seasonal_trough_4, ⁵​spikiness, ⁶​linearity, ⁷​curvature

or

library(fpp3)
tourism |> 
  features(Trips, ~feat_stl(., .period = 4))
#> # A tibble: 304 × 12
#>    Region  State Purpose trend…¹ seaso…² seaso…³ seaso…⁴ spiki…⁵ linea…⁶ curva…⁷
#>    <chr>   <chr> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 Adelai… Sout… Busine…   0.464   0.407       3       1 1.58e+2  -5.31   71.6  
#>  2 Adelai… Sout… Holiday   0.554   0.619       1       2 9.17e+0  49.0    78.7  
#>  3 Adelai… Sout… Other     0.746   0.202       2       1 2.10e+0  95.1    43.4  
#>  4 Adelai… Sout… Visiti…   0.435   0.452       1       3 5.61e+1  34.6    71.4  
#>  5 Adelai… Sout… Busine…   0.464   0.179       3       0 1.03e-1   0.968  -3.22 
#>  6 Adelai… Sout… Holiday   0.528   0.296       2       1 1.77e-1  10.5    24.0  
#>  7 Adelai… Sout… Other     0.593   0.404       2       2 4.44e-4   4.28    3.19 
#>  8 Adelai… Sout… Visiti…   0.488   0.254       0       3 6.50e+0  34.2    -0.529
#>  9 Alice … Nort… Busine…   0.534   0.251       0       1 1.69e-1  23.8    19.5  
#> 10 Alice … Nort… Holiday   0.381   0.832       3       1 7.39e-1 -19.6    10.5  
#> # … with 294 more rows, 2 more variables: stl_e_acf1 <dbl>, stl_e_acf10 <dbl>,
#> #   and abbreviated variable names ¹​trend_strength, ²​seasonal_strength_4,
#> #   ³​seasonal_peak_4, ⁴​seasonal_trough_4, ⁵​spikiness, ⁶​linearity, ⁷​curvature

Created on 2022-10-27 by the reprex package (v2.0.1)

Aariq commented 1 year ago

Thanks for the quick updates. Having examples like these in the documentation, perhaps for features() or feat_stl(), would also be helpful.