JeffreySarnoff / RollingFunctions.jl

Roll a window over data; apply a function over the window.
MIT License

Add support for padding with sentinel value for roll* #24

Open bkamins opened 1 year ago

bkamins commented 1 year ago

Example:

julia> rollmean(1:5, 3)
3-element Vector{Float64}:
 2.0
 3.0
 4.0

julia> runmean(1:5, 3)
5-element Vector{Float64}:
 1.0
 1.5
 2.0
 3.0
 4.0

julia> [fill(missing, 2); rollmean(1:5, 3)]
5-element Vector{Union{Missing, Float64}}:
  missing
  missing
 2.0
 3.0
 4.0

It would be nice to add a kwarg to roll* to allow such padding, e.g. by writing rollmean(1:5, 3; pad=missing).
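
Until such a kwarg exists, the manual padding can be wrapped in a small helper. This is only a minimal sketch: the name rollmean_padded is hypothetical (it is not part of RollingFunctions.jl), it pads at the front only, and it computes the mean with the Statistics standard library rather than calling the package.

```julia
using Statistics

# Hypothetical helper, NOT part of RollingFunctions.jl.
# Computes a plain rolling mean and front-pads the result with `pad`
# so the output has the same length as the input.
function rollmean_padded(data::AbstractVector, span::Integer; pad=missing)
    n = length(data)
    1 <= span <= n || throw(ArgumentError("span must be between 1 and length(data)"))
    # one value per complete window
    rolled = [mean(view(data, i:i+span-1)) for i in 1:n-span+1]
    # span - 1 leading positions have no complete window
    return [fill(pad, span - 1); rolled]
end

padded = rollmean_padded(1:5, 3)
```

Here padded has five elements, the first two being missing, which is the same result as the manual [fill(missing, 2); rollmean(1:5, 3)] above.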

bkamins commented 1 year ago

> Otherwise, it becomes difficult to know the data vectors from the span vector (without always making the span a kwarg -- which we discussed above)

This is why I preferred a kwarg over a positional arg: I believe it is more flexible for future development.

JeffreySarnoff commented 1 year ago

I will allow an Int and allow NTuple{N,Int} for window_span without implementing the NTuple version yet.

BeitianMa commented 1 year ago

In the future, will the padding keyword be supported by the rolling functions in general? As I do research in asset pricing, I often run rolling regressions of stock returns on certain other variables, and it would be handy if the following call worked:

beta = rolling(get_beta, df.return, df.char, rolling_window, padding=missing)

Even though I currently have to add the padding values manually, your package has already given me a 20x+ speedup over Python's rolling.apply() solution, so thank you very much for your work!

JeffreySarnoff commented 1 year ago

@BeitianMa Yes, and it is working in the dev v1 branch:

Pkg.add(url="https://github.com/JeffreySarnoff/RollingFunctions.jl", rev="v1")

Maybe like this (JUST FOR TESTING):

beta = rolling(get_beta, rolling_window, df.return, df.char; padding=missing)

I assume your get_beta works this way:

get_beta(df.return[..], df.char[..])
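
To check the call shape without installing the dev branch, the behaviour can be mimicked with a stand-in. This is a sketch under stated assumptions: rolling_padded is a hypothetical helper (not the package's rolling), get_beta is assumed to be the slope of a univariate OLS regression, and the data vectors are made up for illustration.

```julia
using Statistics

# Assumed definition: OLS slope from regressing `ret` on `char`.
get_beta(ret, char) = cov(ret, char) / var(char)

# Hypothetical stand-in, NOT the package's `rolling`.
# Applies a two-argument function over aligned windows of `x` and `y`,
# then front-pads so the result lines up with the input rows.
function rolling_padded(f, span::Integer, x::AbstractVector, y::AbstractVector;
                        padding=missing)
    n = length(x)
    n == length(y) || throw(DimensionMismatch("data vectors must have equal length"))
    1 <= span <= n || throw(ArgumentError("span must be between 1 and length(x)"))
    rolled = [f(view(x, i:i+span-1), view(y, i:i+span-1)) for i in 1:n-span+1]
    return [fill(padding, span - 1); rolled]
end

# Illustrative data only: ret is exactly 0.02 * char.
ret  = [0.02, 0.04, 0.06, 0.08, 0.10]
char = [1.0, 2.0, 3.0, 4.0, 5.0]

beta = rolling_padded(get_beta, 3, ret, char; padding=missing)
```

The result has one entry per row of the input, so it can be assigned back to a data frame column directly; the first span - 1 entries are the padding value.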

JeffreySarnoff commented 1 year ago

@bkamins I have settled on an implementation pattern

rolling(func, span, datavec1; padding<=nopadding>, padlast<=false>)

and, where weights<:StatsBase.AbstractWeights,

rolling(func, span, datavec1, weights1; padding<=nopadding>, padlast<=false>)

Both are passing their tests; however, more comprehensive tests are indicated.

JeffreySarnoff commented 1 year ago

On development version 0.9.75 (v1-adjacent), test coverage is decent and all tests pass.


Next is to comb the docs into relevance. Meanwhile:

# rolling advances one step at a time

rolling(func, window_span, data_vec [, data_vec2[, data_vec3]] 
           [, StatsBase.weights [,weights2[, weights3]] ];
           padding=nopadding, atend=false)

rolling(func, window_span, data_mat
            [, StatsBase.weights [...]]; 
            padding=nopadding, atend=false)
#          func is applied to each [[individually] weighted] column of the data_mat

# tiling gulps span elements and then advances over those same elements,
#    to gulp the successor span elements ..

tiling(func, window_span_is_tile_span, data_vec [, data_vec2[, data_vec3]] 
           [, StatsBase.weights [,weights2[, weights3]] ];
           padding=nopadding, atend=false)

tiling(func, window_span_is_tile_span, data_mat 
            [, StatsBase.weights [...]]; 
           padding=nopadding, atend=false)
#         func is applied to each [[individually] weighted] column of the data_mat
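
To illustrate the tiling traversal described above, here is a minimal sketch. The name tiling_sketch is hypothetical, and dropping an incomplete final tile is an assumption made for this example, not confirmed package behaviour; padding and atend are not modelled.

```julia
# Hypothetical illustration, NOT the package's `tiling`.
# Applies `f` to consecutive, non-overlapping windows of length `span`.
# Assumption: trailing elements that do not fill a whole tile are dropped.
function tiling_sketch(f, span::Integer, data::AbstractVector)
    n = length(data)
    1 <= span <= n || throw(ArgumentError("span must be between 1 and length(data)"))
    # window starts advance by `span`, so tiles never overlap
    return [f(view(data, i:i+span-1)) for i in 1:span:n-span+1]
end

tiles_even   = tiling_sketch(sum, 3, 1:6)   # two full tiles: 1:3 and 4:6
tiles_uneven = tiling_sketch(sum, 3, 1:7)   # element 7 does not fill a tile
```

By contrast, rolling over the same data advances one element at a time, so consecutive windows share span - 1 elements.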

# running tapers the function at the start or at the end of the data

running(func, window_span, data_vec [, data_vec2[, data_vec3]] 
              [, StatsBase.weights [,weights2[, weights3]] ];
             atend=false)

running(func, window_span, data_mat
             [, StatsBase.weights [...]];
             atend = false)
#           func is applied to each [[individually] weighted] column of the data_mat
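
The tapering at the start of the data can be sketched as follows. The name running_sketch is hypothetical and only the default start-of-data taper is modelled (no weights, no atend); it is written to reproduce the runmean(1:5, 3) output shown at the top of this issue.

```julia
using Statistics

# Hypothetical illustration, NOT the package's `running`.
# Until a full window is available, `f` is applied to the shorter
# prefix of the data, so the output has the same length as the input.
function running_sketch(f, span::Integer, data::AbstractVector)
    span >= 1 || throw(ArgumentError("span must be positive"))
    # the window ending at i starts at i - span + 1, clamped to the first index
    return [f(view(data, max(1, i - span + 1):i)) for i in 1:length(data)]
end

tapered = running_sketch(mean, 3, 1:5)
```

The first span - 1 results come from windows of length 1, 2, and so on, which is why no padding keyword is needed for running.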

[roll|tile|run]_
where _ is {min, max, extrema, mean, sum, var, std, cov, cor}

JeffreySarnoff commented 12 months ago

Early access to RollingFunctions.jl v1 is available as WindowedFunctions.jl. This version does more, has padding, and provides tiling (see the docs).

Add the package:
Pkg.add(url="https://github.com/JeffreySarnoff/WindowedFunctions.jl")

This version is radically redesigned. The new approach offers the most requested capabilities, provides much flexibility, and performs well. All of the widely used call signatures will remain supported for many months following the official release of v1, identified as deprecated calls with detailed information about how to revise them.

It would be a great help if you would take this prerelease for a walk. Any issues or suggestions should be posted here.