r-lib / slider

Sliding Window Functions
https://slider.r-lib.org

duplicated indices/dates #196

Closed Steviey closed 9 months ago

Steviey commented 9 months ago

It does not seem to work with duplicated indices/dates. Neither slide_period_dfr() nor slide_index_dfr() handles them. I always get strange results/behaviors, as with runner::runner().

DavisVaughan commented 9 months ago

Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page.

You can install reprex by running (you may already have it, though, if you have the tidyverse package installed):

install.packages("reprex")

Thanks

DavisVaughan commented 9 months ago

Duplicates are working as I expect them to. This may be surprising to you, but it is the only reasonable behavior I could come up with, and I think it matches SQL (which uses the term "peers" for duplicates).

library(slider)

index <- c(1, 1, 2, 3, 3, 3, 4, 5)

# With duplicated indices, we aggregate the duplicates together
# and treat them like 1 index. So the actual unique values are:
# x: (1, 1) (2) (3, 3, 3) (4) (5)
# i: 1       2   3         4   5
# But then after we evaluate the function calls, we map the results
# back to their original location to ensure that the length of
# the input is the same as the length of the output.

slide_index(index, index, identity)
#> [[1]]
#> [1] 1 1
#> 
#> [[2]]
#> [1] 1 1
#> 
#> [[3]]
#> [1] 2
#> 
#> [[4]]
#> [1] 3 3 3
#> 
#> [[5]]
#> [1] 3 3 3
#> 
#> [[6]]
#> [1] 3 3 3
#> 
#> [[7]]
#> [1] 4
#> 
#> [[8]]
#> [1] 5

# Here is proof that for (1, 1) we only evaluate the function once,
# but then insert the result into slots 1 and 2 of the result
slide_index(index, index, ~rnorm(length(.x)))
#> [[1]]
#> [1] -0.49264648  0.01369336
#> 
#> [[2]]
#> [1] -0.49264648  0.01369336
#> 
#> [[3]]
#> [1] -0.4389042
#> 
#> [[4]]
#> [1] 0.8276602 0.8240620 0.1718141
#> 
#> [[5]]
#> [1] 0.8276602 0.8240620 0.1718141
#> 
#> [[6]]
#> [1] 0.8276602 0.8240620 0.1718141
#> 
#> [[7]]
#> [1] -0.7381693
#> 
#> [[8]]
#> [1] -0.384347

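# With `.before = 1`, the window for index value i covers every element
# whose index falls in [i - 1, i]. Duplicates still share a single result.
slide_index(index, index, identity, .before = 1)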
#> [[1]]
#> [1] 1 1
#> 
#> [[2]]
#> [1] 1 1
#> 
#> [[3]]
#> [1] 1 1 2
#> 
#> [[4]]
#> [1] 2 3 3 3
#> 
#> [[5]]
#> [1] 2 3 3 3
#> 
#> [[6]]
#> [1] 2 3 3 3
#> 
#> [[7]]
#> [1] 3 3 3 4
#> 
#> [[8]]
#> [1] 4 5

Created on 2023-10-11 with reprex v2.0.2
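
For readers who want the grouping logic spelled out, here is a rough base R emulation of the behavior shown above. It is only a sketch under stated assumptions, not slider's actual implementation: the names `before`, `unique_index`, `per_unique`, and `calls` are made up for illustration, and the window is assumed to be every element whose index lies in [i - before, i].

```r
index <- c(1, 1, 2, 3, 3, 3, 4, 5)
x <- index
before <- 1  # hypothetical stand-in for `.before = 1`

# Collapse duplicated indices ("peers") into one entry per unique value
unique_index <- unique(index)

# Counter showing how often the window function is actually evaluated
calls <- 0

# Evaluate once per unique index, over all elements inside the window
per_unique <- lapply(unique_index, function(i) {
  calls <<- calls + 1
  x[index >= i - before & index <= i]
})

# Map each result back to every original position that shares that index,
# so the output has the same length as the input
result <- per_unique[match(index, unique_index)]

length(result)  # same length as `index`
calls           # one evaluation per unique index value
result[[4]]     # window shared by all three elements with index 3
```

The `match()` step is what keeps the output length equal to the input length while evaluating the function only once per group of duplicates.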