r-lib / slider

Sliding Window Functions
https://slider.r-lib.org
Other
296 stars 12 forks source link

slide and dplyr::group_by #193

Closed talegari closed 1 year ago

talegari commented 1 year ago

Hi Davis Vaughan,

I am attempting a sliding window operation on a the grouped dataframe where slide is expected to be applied on each chunk. My current code looks like this involving tidyr::nest:

iris %>% 
  tidyr::nest(.by = Species) %>% 
  dplyr::mutate(data = 
           purrr::map(data,
               ~ slider::slide_dbl(dplyr::arrange(.x, Petal.Width),
                                   \(x) sum(x$Sepal.Length) + mean(x$Sepal.Width),
                                   .before = Inf,
                                   .after = 0
                                   )
               )
         ) %>% 
  tidyr::unnest(data)

Is there a better workflow with group_by where we could do something like this if the context of current chunk is available (say using get_chunk):

iris %>% 
  mutate(result = 
           slider::slide_dbl(dplyr::arrange(get_chunk(), Petal.Width),
                             \(x) sum(x$Sepal.Length) + mean(x$Sepal.Width),
                             .before = Inf,
                             .after = 0
                             ),
         .by = Species
         ) 

PS: Let me know if question should be in dplyr and not slider, I will post it there.

DavisVaughan commented 1 year ago

Doesn't look like you need nesting at all!

library(tidyverse)

my_helper <- function(x, y) {
  sum(x) + mean(y)
}

new <- iris %>%
  as_tibble() %>%
  arrange(Species, Petal.Width) %>%
  mutate(
    value = slider::slide2_dbl(
      .x = Sepal.Length,
      .y = Sepal.Width,
      .f = my_helper,
      .before = Inf
    ),
    .by = Species
  )

old <- iris %>% 
  tidyr::nest(.by = Species) %>% 
  dplyr::mutate(data = 
                  purrr::map(data,
                             ~ slider::slide_dbl(dplyr::arrange(.x, Petal.Width),
                                                 \(x) sum(x$Sepal.Length) + mean(x$Sepal.Width),
                                                 .before = Inf,
                                                 .after = 0
                             )
                  )
  ) %>% 
  tidyr::unnest(data)

identical(new$value, old$data)
#> [1] TRUE

Created on 2023-02-20 with reprex v2.0.2.9000