r-lib / slider

Sliding Window Functions
https://slider.r-lib.org

Padding problems when running slider over multiple columns #162

Closed heinonmatti closed 2 years ago

heinonmatti commented 2 years ago

Hi,

It seems like complete = TRUE doesn't work when the function is run through dplyr::across() or purrr::map(). Or have I misunderstood how it should operate?

Reprex:

data(iris)

iris %>% 
  dplyr::select(-Species) %>% 
  purrr::map(.x = ., 
             .f = ~slider::slide_dbl(
               .x = ., .before = 10, .after = 0, complete = TRUE,
               .f = ~var(.x)
             )
  )

iris %>% 
  dplyr::mutate(across(-Species,
                       ~slider::slide_dbl(
                         .x = ., 
                         .f = ~var(.x),
                         .before = 10, 
                         .after = 0, 
                         complete = TRUE)))

DavisVaughan commented 2 years ago

I'm not sure what you mean. These give identical results for me.

library(tidyverse)
library(slider)

res1 <- iris %>% 
  dplyr::select(-Species) %>% 
  purrr::map(.x = ., 
             .f = ~slider::slide_dbl(
               .x = ., .before = 10, .after = 0, complete = TRUE,
               .f = ~var(.x)
             )
  )

res2 <- iris %>% 
  dplyr::mutate(across(-Species,
                       ~slider::slide_dbl(
                         .x = ., 
                         .f = ~var(.x),
                         .before = 10, 
                         .after = 0, 
                         complete = TRUE)))

res2$Species <- NULL
res2 <- as.list(res2)

identical(res1, res2)
#> [1] TRUE

Created on 2022-01-04 by the reprex package (v2.0.1)

DavisVaughan commented 2 years ago

Oh, were you expecting the first value not to be NA? That happens because var() of a single value returns NA: you can't compute the variance of a single observation.

library(tidyverse)
library(slider)

res <- iris %>% 
  dplyr::select(-Species) %>% 
  purrr::map(.x = ., 
             .f = ~slider::slide_dbl(
               .x = ., .before = 10, .after = 0, complete = TRUE,
               .f = ~var(.x)
             )
  )

head(res[[1]])
#> [1]         NA 0.02000000 0.04000000 0.04916667 0.04300000 0.08300000

var(1)
#> [1] NA

Created on 2022-01-04 by the reprex package (v2.0.1)

heinonmatti commented 2 years ago

Yes, but I thought the first 10 values should be NA (as per .before = 10 and complete = TRUE), as happens when I run the function on a vector instead of several columns?

Matti

DavisVaughan commented 2 years ago

Oh, duh. The argument is .complete, not complete.

library(tidyverse)
library(slider)

res <- iris %>% 
  dplyr::select(-Species) %>% 
  purrr::map(.x = ., 
             .f = ~slider::slide_dbl(
               .x = ., .before = 10, .after = 0, .complete = TRUE,
               .f = ~var(.x)
             )
  )

head(res[[1]], n = 12)
#>  [1]         NA         NA         NA         NA         NA         NA
#>  [7]         NA         NA         NA         NA 0.10290909 0.09963636

Created on 2022-01-04 by the reprex package (v2.0.1)
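
To see the effect of .complete on a smaller scale, here is a minimal sketch on a made-up five-element vector (the input and the window size are illustrative, not taken from the thread). It assumes the slider package is installed. With .before = 2, a window is complete only once two earlier elements exist, so the first two results become NA.

```r
library(slider)

# Hypothetical toy input, chosen so the window means are easy to check by hand
x <- 1:5

# Default (.complete = FALSE): partial windows at the start are still evaluated
partial <- slide_dbl(x, mean, .before = 2)

# .complete = TRUE: positions without a full window of 3 elements return NA
complete <- slide_dbl(x, mean, .before = 2, .complete = TRUE)

print(partial)
print(complete)
```

The same logic explains the ten leading NA values above: with .before = 10, the first complete window ends at the 11th row.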

DavisVaughan commented 2 years ago

Note that if you had used var rather than ~var(.x), you would have caught this sooner. slide_dbl() forwards any extra arguments through ... to .f, and a function created with ~ accepts ..., so the misspelled complete argument was silently swallowed.

library(tidyverse)
library(slider)

res <- iris %>% 
  dplyr::select(-Species) %>% 
  purrr::map(.x = ., 
             .f = ~slider::slide_dbl(
               .x = ., .before = 10, .after = 0, complete = TRUE,
               .f = var
             )
  )
#> Error in .f(.x, ...): unused argument (complete = TRUE)

Created on 2022-01-04 by the reprex package (v2.0.1)
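
The swallowing can be reproduced in base R without slider. The sketch below is an illustration under stated assumptions, not slider's actual implementation: forward_dots() is a hypothetical stand-in for any function that passes ... on to .f, and lambda_like() mimics the ...-accepting signature of a ~ lambda.

```r
# Hypothetical stand-in for a function that forwards `...` to `.f`.
# `.complete` comes after `...`, so it is only matched by its exact name.
forward_dots <- function(.x, .f, ..., .complete = FALSE) {
  .f(.x, ...)
}

# Mimics a `~var(.x)` lambda: it accepts `...`, so unknown arguments are ignored
lambda_like <- function(..., .x = ..1) var(.x)

# The misspelled `complete` lands in `...` and is silently dropped by the lambda
swallowed <- forward_dots(c(1, 2, 3), lambda_like, complete = TRUE)
print(swallowed)

# Passing `var` directly surfaces the typo, because var() has no `complete` argument
caught <- tryCatch(
  forward_dots(c(1, 2, 3), var, complete = TRUE),
  error = function(e) conditionMessage(e)
)
print(caught)
```

Passing a named function that has no ... in its signature is therefore a cheap way to get an error for misspelled arguments.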

heinonmatti commented 2 years ago

Oof -- ok, sorry for the trouble and many thanks!