r-lib / slider

Sliding Window Functions
https://slider.r-lib.org

Access previous computations while sliding? #169

Closed ryantibs closed 2 years ago

ryantibs commented 2 years ago

Hi @DavisVaughan, I have a question that I wonder if you've thought about.

Suppose I want to do a sliding computation that needs access to previous computations. As an example, suppose I'm computing predictions using some kind of sliding regression, and I want to access test residuals (errors from past predictions) and correct my online predictions using the test residual distribution, as a kind of calibration step.

The current way I'd do this would be to call one of the slide() functions twice. The first time to make the predictions, and the second time to calibrate them.
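A minimal sketch of that two-pass approach, under heavy assumptions: the data `x` is made up, a rolling mean stands in for the sliding regression, and "calibration" is reduced to shifting each prediction by the mean of the residuals observed so far. None of these choices come from the thread; they only illustrate the shape of the two calls.

```r
library(slider)

# Hypothetical toy series; any numeric vector would do.
x <- c(2, 4, 3, 7, 5, 8)

# Pass 1: a rolling mean over the current and two previous points stands in
# for the prediction of the next value.
raw <- slide_dbl(x, mean, .before = 2)

# Test residuals: the value that arrived next minus what was predicted.
# The last prediction has no observed outcome yet, so its residual is NA.
errors <- c(x[-1] - raw[-length(raw)], NA)

# At step i only the residuals of predictions 1..(i - 1) are known, so lag by one.
known <- c(NA, errors[-length(errors)])

# Pass 2: slide over all residuals known so far and average them.
shift <- slide_dbl(known, ~ mean(.x, na.rm = TRUE), .before = Inf)
shift[is.nan(shift)] <- 0  # no residuals available at the first step

calibrated <- raw + shift
```

Both passes are pure functions of their window, which is what makes the two-call version straightforward, at the cost of materialising the intermediate `raw` and `errors` vectors.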

My question is as follows: would it make sense to allow a slide() function to access a previous computation (from sliding)? This way it could all be done in one pass.

Of course, manual iteration with a for() loop affords you access to computations that were produced earlier in the loop (provided you're saving them in some pre-allocated vector). But I'm not aware of this being afforded by apply() functions, or plyr functions, or purrr functions, etc. Figured I would start by asking you since I'm currently using the slider package (which has been great in general). Thanks!

DavisVaughan commented 2 years ago

I think the best solution for you is to either call slide() twice as you are doing now, or to call slide() once to generate the list of indices you'd use to slice your data at each iteration, and then iterate over that list of indices with a for loop.

At each iteration you'd be able to look at the previous iteration's results and use that to adjust the result in this iteration.
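A rough sketch of the second option, assuming made-up data `x`, a rolling mean as a stand-in for the real model, and a simple mean-residual shift as the calibration step (all hypothetical, chosen only to keep the example short). slide() is called once over the positions to get the index window for each step, and a for loop does the rest:

```r
library(slider)

# Hypothetical toy series.
x <- c(2, 4, 3, 7, 5, 8)
n <- length(x)

# One slide() call over the positions returns the index window for each step.
windows <- slide(seq_along(x), ~.x, .before = 2)

raw <- numeric(n)            # uncalibrated predictions
errors <- rep(NA_real_, n)   # test residuals, filled in as outcomes arrive
calibrated <- numeric(n)     # predictions corrected by past residuals

for (i in seq_len(n)) {
  idx <- windows[[i]]

  # The outcome for the previous prediction is now known, so record its residual.
  if (i > 1) {
    errors[i - 1] <- x[i] - raw[i - 1]
  }

  # Stand-in "model": mean of the current window.
  raw[i] <- mean(x[idx])

  # Correct using every residual observed so far (none at the first step).
  past <- errors[!is.na(errors)]
  shift <- if (length(past) > 0) mean(past) else 0
  calibrated[i] <- raw[i] + shift
}
```

The windowing logic (`.before`, `.after`, `.complete`, and so on) still comes from slider; only the iteration itself moves into the loop, where earlier results are in scope.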

I'd say it is a feature of the apply/purrr/slider family of functions that they generally don't let you look at any other computations. They are nicely self-contained, which is partially why furrr is even possible (being self-contained means they are "embarrassingly parallel").

DavisVaughan commented 2 years ago

You could always use some kind of persistent object like an environment if you really wanted to do this in one call and felt comfortable with how environments work!

library(slider)

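# A persistent environment that survives across calls to .f
env <- new.env(parent = emptyenv())
env$previous <- NULL

slide(
  .x = 1:6,
  .f = ~{
    # Read what the prior call stored, then overwrite it with the current window
    previous <- env$previous
    env$previous <- .x
    list(current = .x, previous = previous)
  },
  .before = 3
)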
#> [[1]]
#> [[1]]$current
#> [1] 1
#> 
#> [[1]]$previous
#> NULL
#> 
#> 
#> [[2]]
#> [[2]]$current
#> [1] 1 2
#> 
#> [[2]]$previous
#> [1] 1
#> 
#> 
#> [[3]]
#> [[3]]$current
#> [1] 1 2 3
#> 
#> [[3]]$previous
#> [1] 1 2
#> 
#> 
#> [[4]]
#> [[4]]$current
#> [1] 1 2 3 4
#> 
#> [[4]]$previous
#> [1] 1 2 3
#> 
#> 
#> [[5]]
#> [[5]]$current
#> [1] 2 3 4 5
#> 
#> [[5]]$previous
#> [1] 1 2 3 4
#> 
#> 
#> [[6]]
#> [[6]]$current
#> [1] 3 4 5 6
#> 
#> [[6]]$previous
#> [1] 2 3 4 5

Created on 2022-04-01 by the reprex package (v2.0.1)
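The reprex above carries the previous window forward. The same pattern can carry a previous result instead, which is closer to the calibration use case. The sketch below is an illustration only: `x`, the rolling-mean "model", the mean-residual shift, and the names `state`, `last_raw`, and `errors` are all hypothetical. It also assumes `.f` is called in order from the first element to the last, which matches the output above for this kind of call but should be treated as an assumption rather than a guarantee.

```r
library(slider)

# Hypothetical toy series.
x <- c(2, 4, 3, 7, 5, 8)

# Persistent state shared across calls to .f.
state <- new.env(parent = emptyenv())
state$last_raw <- NULL       # previous uncalibrated prediction
state$errors <- numeric(0)   # residuals observed so far

calibrated <- slide_dbl(
  .x = seq_along(x),
  .f = function(idx) {
    i <- idx[length(idx)]  # position of the current element

    # The outcome for the previous prediction is now known; store its residual.
    if (!is.null(state$last_raw)) {
      state$errors <- c(state$errors, x[i] - state$last_raw)
    }

    # Stand-in "model": mean of the current window.
    raw <- mean(x[idx])
    state$last_raw <- raw

    # Shift by the mean of past residuals (zero when none exist yet).
    shift <- if (length(state$errors) > 0) mean(state$errors) else 0
    raw + shift
  },
  .before = 2
)
```

Because `.f` now depends on hidden state, it is no longer self-contained, so this version would not be safe to run in parallel, and `state` has to be reset before the call is repeated.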

ryantibs commented 2 years ago

Thanks @DavisVaughan. I'll stick with the simpler "slide twice" solution, and it's good to know I haven't missed anything with purrr or slider.

And thanks for pointing out the environment trick; good to know also. I'll close this issue now.