njtierney / brolgar

BRowse Over Longitudinal Data Graphically and Analytically in R
http://brolgar.njtierney.com/

Add `feat_diff_summary` #100

Closed njtierney closed 4 years ago

njtierney commented 4 years ago

Provide better handling of differences. This is useful with longitudinal data when you want to summarise the intervals between measurements.

library(tidyverse)
library(brolgar)

Currently you need to do something like this to get summaries of the diffs:

heights %>% 
  features(year, diff) %>%
  group_by(country) %>%
  mutate(max_diff = max(c_across(everything()), na.rm = TRUE),
         min_diff = min(c_across(everything()), na.rm = TRUE)) %>% 
  select(-starts_with("...")) 
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> * `` -> ...3
#> * `` -> ...4
#> * `` -> ...5
#> * ...
#> # A tibble: 144 x 3
#> # Groups:   country [144]
#>    country     max_diff min_diff
#>    <chr>          <dbl>    <dbl>
#>  1 Afghanistan       60       10
#>  2 Albania          100       10
#>  3 Algeria           60       10
#>  4 Angola            70       10
#>  5 Argentina         40       10
#>  6 Armenia           30       10
#>  7 Australia         40       10
#>  8 Austria           40       10
#>  9 Azerbaijan        90       10
#> 10 Bahrain           10       10
#> # … with 134 more rows

But we can use `features` to help us:

b_diff_max <- function(x, na.rm = TRUE, ...){
  # diff() has no na.rm argument, so na.rm goes to max() instead
  max(diff(x, ...), na.rm = na.rm)
}

b_diff_min <- function(x, na.rm = TRUE, ...){
  min(diff(x, ...), na.rm = na.rm)
}

heights %>% 
  features(year, lst(b_diff_max,
                      b_diff_min))
#> # A tibble: 144 x 3
#>    country     b_diff_max b_diff_min
#>    <chr>            <dbl>      <dbl>
#>  1 Afghanistan         60         10
#>  2 Albania            100         10
#>  3 Algeria             60         10
#>  4 Angola              70         10
#>  5 Argentina           40         10
#>  6 Armenia             30         10
#>  7 Australia           40         10
#>  8 Austria             40         10
#>  9 Azerbaijan          90         10
#> 10 Bahrain             10         10
#> # … with 134 more rows
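
To make the numbers above concrete, here is a small base R sketch of what "max diff" and "min diff" mean for each key. The `toy` data frame and its values are invented for illustration and are not taken from `heights`; `tapply` stands in for the per-key grouping that `features` does.

```r
# Toy longitudinal data (hypothetical values, not from `heights`)
toy <- data.frame(
  country = c("A", "A", "A", "A", "B", "B", "B"),
  year    = c(1900, 1910, 1950, 1960, 1800, 1810, 1820)
)

# Largest and smallest gap between consecutive measurements, per country
gap_max <- tapply(toy$year, toy$country, function(x) max(diff(x)))
gap_min <- tapply(toy$year, toy$country, function(x) min(diff(x)))

gap_max
gap_min
```

Country `A` was measured in 1900, 1910, 1950 and 1960, so its gaps are 10, 40 and 10 years; country `B` was measured every 10 years.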

The goal is then to add `feat_diff_summary` so you can summarise the diffs. Here is a starting point, although the `na.rm` parts might need to be removed:

# diff() has no na.rm argument, so na.rm is passed to the summary function

b_diff_var <- function(x, na.rm = TRUE, ...){
  var(diff(x, ...), na.rm = na.rm)
}

b_diff_sd <- function(x, na.rm = TRUE, ...){
  sd(diff(x, ...), na.rm = na.rm)
}

b_diff_mean <- function(x, na.rm = TRUE, ...){
  mean(diff(x, ...), na.rm = na.rm)
}

b_diff_median <- function(x, na.rm = TRUE, ...){
  median(diff(x, ...), na.rm = na.rm)
}

b_diff_q25 <- function(x, na.rm = TRUE, ...){
  b_q25(diff(x, ...), na.rm = na.rm)
}

b_diff_q75 <- function(x, na.rm = TRUE, ...){
  b_q75(diff(x, ...), na.rm = na.rm)
}

Created on 2020-09-04 by the reprex package (v0.3.0)
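
One possible shape for the combined feature is sketched below, using base R only. This is not the implementation that was merged: the name `feat_diff_summary_sketch`, the output names, and the choice of quantile `type = 8` are all assumptions made for illustration. It computes the differences once and returns every summary as a single named vector, with `na.rm` applied to the summaries rather than to `diff()`.

```r
# Hypothetical sketch: bundle the diff summaries into one named vector.
# Function name, output names and quantile type are assumptions.
feat_diff_summary_sketch <- function(x, na.rm = TRUE, ...) {
  # diff() has no na.rm argument; `...` carries e.g. lag or differences
  d <- diff(x, ...)
  c(
    diff_min    = min(d, na.rm = na.rm),
    diff_q25    = unname(stats::quantile(d, probs = 0.25, type = 8, na.rm = na.rm)),
    diff_median = stats::median(d, na.rm = na.rm),
    diff_mean   = mean(d, na.rm = na.rm),
    diff_q75    = unname(stats::quantile(d, probs = 0.75, type = 8, na.rm = na.rm)),
    diff_max    = max(d, na.rm = na.rm),
    diff_var    = stats::var(d, na.rm = na.rm),
    diff_sd     = stats::sd(d, na.rm = na.rm)
  )
}

# Measurements in 1900, 1910, 1950 and 1960 give gaps of 10, 40 and 10 years
out <- feat_diff_summary_sketch(c(1900, 1910, 1950, 1960))
out

# A missing measurement produces NA gaps, which na.rm = TRUE drops
out_na <- feat_diff_summary_sketch(c(1, NA, 3, 5))
out_na
```

If it behaves like the other `feat_*` functions in brolgar, a function of this shape would be used as `heights %>% features(year, feat_diff_summary_sketch)`, giving one column per summary.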

njtierney commented 4 years ago

Resolved in #101.