EdwinTh / padr

Padding of missing records in time series
https://edwinth.github.io/padr/

dplyr::group_by support #88

Open geotheory opened 2 years ago

geotheory commented 2 years ago

Is this officially supported or just an unintended bonus?

require(padr)
#> Loading required package: padr
require(tidyverse)
#> Loading required package: tidyverse

df = tibble(day = as.Date(c('2022-05-02', '2022-05-09', '2022-05-23', '2022-05-04', '2022-05-18', '2022-05-25')),
            y = c(3, 4, 6, 8, 3, 5), grp = rep(c('A','B'), each = 3))

df |> 
  dplyr::group_by(grp) |> 
  pad(interval = 'week') |> 
  fill_by_value(y) |> 
  print() |> 
  ggplot(aes(day, y, col = grp)) + 
  geom_line() + geom_point()
#> # A tibble: 8 × 3
#> # Groups:   grp [2]
#>   day            y grp  
#>   <date>     <dbl> <chr>
#> 1 2022-05-02     3 A    
#> 2 2022-05-09     4 A    
#> 3 2022-05-16     0 A    
#> 4 2022-05-23     6 A    
#> 5 2022-05-04     8 B    
#> 6 2022-05-11     0 B    
#> 7 2022-05-18     3 B    
#> 8 2022-05-25     5 B

[Image: line and point plot of y against day, coloured by grp]

EdwinTh commented 2 years ago

It is officially supported; you can use either the group argument or dplyr::group_by.
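A minimal sketch of the two routes mentioned here, assuming the `df` from the opening comment (rebuilt as a plain data.frame so the snippet is self-contained). `pad()`'s `group` argument takes the grouping column name(s) as a character vector. The row-count comparison at the end is only an illustrative check, not something taken from the padr documentation.

```r
library(padr)
library(dplyr)

# Same data as in the opening comment
df <- data.frame(
  day = as.Date(c('2022-05-02', '2022-05-09', '2022-05-23',
                  '2022-05-04', '2022-05-18', '2022-05-25')),
  y   = c(3, 4, 6, 8, 3, 5),
  grp = rep(c('A', 'B'), each = 3)
)

# Option 1: name the grouping column through pad()'s group argument
via_arg <- pad(df, interval = 'week', group = 'grp')

# Option 2: group first with dplyr, then pad
via_group_by <- df |>
  group_by(grp) |>
  pad(interval = 'week')

# Both routes pad within each group, so the row counts can be compared
nrow(via_arg)
nrow(via_group_by)
```

Either result can then be passed to `fill_by_value(y)` to replace the `NA`s in the padded rows, as in the original example.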

geotheory commented 2 years ago

Awesome, sorry, I couldn't find that in the documentation.

On Tue, 31 May 2022, 06:16 Edwin Thoen, @.***> wrote:

It is officially supported; you can use either the group argument or dplyr::group_by.


geotheory commented 2 years ago

Also, I note the group argument is evaluated, so you can use e.g. group = "paste(grp1, grp2)" for multiple variables. Might also be worth documenting :)
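For comparison, a hedged sketch of grouping on several variables by passing their names as a character vector to `group`, which is how the argument is described in `?pad`. The data frame `df2` and its columns `grp1` and `grp2` are hypothetical and made up for illustration; the `paste()` form above is not exercised here.

```r
library(padr)

# Hypothetical data with two grouping columns (not from the thread)
df2 <- data.frame(
  day  = as.Date(c('2022-05-02', '2022-05-16', '2022-05-04', '2022-05-18')),
  y    = c(3, 4, 8, 3),
  grp1 = c('A', 'A', 'B', 'B'),
  grp2 = c('x', 'x', 'y', 'y')
)

# Pad within each grp1/grp2 combination present in the data
padded2 <- pad(df2, interval = 'week', group = c('grp1', 'grp2'))

# Replace the NA values in y that padding introduced (default fill value is 0)
padded2 <- fill_by_value(padded2, y)
padded2
```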

Below is the only reference to group_by I can find for padding. But as you say, it's different in allowing per-group intervals, whereas my usage is directly equivalent to the group argument.

# applying pad with do, interval is determined individually for each group
x %>% group_by(id) %>% do(pad(.))

Super useful package btw I use it a lot.

geotheory commented 2 years ago

OK I've just seen your blog on this! https://edwinth.github.io/blog/pad.v0.2.0/

EdwinTh commented 2 years ago

I’ll keep this issue open to see if I need to improve the docs.