cmu-delphi / epiprocess

Tools for basic signal processing in epidemiology
https://cmu-delphi.github.io/epiprocess/
Other
13 stars 9 forks source link

Merge `epi_slide_opt` with `epi_slide`? #546

Open dshemetov opened 4 days ago

dshemetov commented 4 days ago

' The optimized data.table and slider functions can't be directly passed

' as the computation function in epi_slide without careful handling to make

' sure each computation group is made up of the .window_size dates rather

' than .window_size points. epi_slide_opt (and wrapper functions

' epi_slide_mean and epi_slide_sum) take care of window completion

' automatically to prevent associated errors.

Now that epi_slide does this too, we might be able to merge these function in? Not super high priority, but a possibility.

brookslogan commented 4 days ago

This part of the comment doesn't fully make sense, because epi_slide passes individual group x time windows to the functions (unlike mutate_scattered_ts, which passes group x (completed) time series; if I wrote that comment I was probably thinking of that function instead).

(The completion part does make sense and we have standardized the completion as noted above.)

epi_slide_opt also has another difference in that it is using across-like tidyselect + fun, rather than providing a data-masking expression or function.

A tidyverse-inspired way to do this would be to add across support to epi_slide and detect functions like sum, mean, min, etc., and instead of doing the normal slow version of things, instead call the fast versions instead. Limiting this detection to across usage would likely make it easier to implement, though dplyr is able to inspect data-masking expressions as well when it's determining whether to use its specialized fast-grouped-sum operations under the hood.