GerkeLab / gerkelab-com

Website source for gerkelab.com
http://www.gerkelab.com

Post: Passing the dots to a function inside a pipe chain #26

Open gadenbuie opened 5 years ago

gadenbuie commented 5 years ago

Just ran into this problem and made a reprex to figure it out. Wouldn't take much to turn it into a quick blog post about the value of purrr::partial().

library(purrr)
library(tidyr)
library(dplyr)

# population is an example dataset that ships with tidyr;
# set 100 randomly chosen population values to NA
population[sample(seq_len(nrow(population)), 100), "population"] <- NA

Here is a simple function that wraps a data processing pipe, where we want to pass the ... on to a function called inside an anonymous function.

dot_passer <- function(df, ...) {
  df %>% 
    nest(data = -country) %>%
    mutate(pop = map_dbl(data, ~ mean(.$population, ...)))
}

This doesn’t work and it gives you a completely bonkers error message. Who said anything about trim?

population %>% 
  dot_passer(na.rm = TRUE)
#> Error in mean.default(.$population, ...): 'trim' must be numeric of length one

But you're convinced this should work, because putting na.rm directly in the pipe chain works.

population %>% 
  nest(data = -country) %>%
  mutate(pop = map_dbl(data, ~ mean(.$population, na.rm = TRUE)))
#> # A tibble: 219 x 3
#>    country             data                    pop
#>    <chr>               <list>                <dbl>
#>  1 Afghanistan         <tibble [19 × 2]> 23703086.
#>  2 Albania             <tibble [19 × 2]>  3235417 
#>  3 Algeria             <tibble [19 × 2]> 34047477.
#>  4 American Samoa      <tibble [19 × 2]>    56705.
#>  5 Andorra             <tibble [19 × 2]>    74590.
#>  6 Angola              <tibble [19 × 2]> 16275446.
#>  7 Anguilla            <tibble [19 × 2]>    12246.
#>  8 Antigua and Barbuda <tibble [19 × 2]>    80887.
#>  9 Argentina           <tibble [19 × 2]> 38252583.
#> 10 Armenia             <tibble [19 × 2]>  3042757.
#> # … with 209 more rows
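
The cause shows up on a toy input. A minimal sketch (show_dots() is a made-up helper, not part of the reprex) that returns whatever ... holds inside a ~ lambda:

```r
library(purrr)

# Hypothetical helper: collect whatever `...` holds inside a `~` lambda
show_dots <- function(xs, ...) {
  map(xs, ~ list(...))
}

res <- show_dots(c(10, 20), na.rm = TRUE)
str(res)

# The function purrr builds from a formula declares its own `...`
lambda_args <- names(formals(as_mapper(~ list(...))))
lambda_args
```

Each element of res holds the value map() handed to the lambda, and na.rm = TRUE never shows up.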

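The ... inside a ~ lambda are not the ... of dot_passer(). purrr turns the formula into a function that has its own ..., and those hold whatever map_dbl() passes in, here the nested tibble. That tibble ends up as the second argument of mean(), which is trim, hence the error. If we create a partial function first, outside of the anonymous function, the dots get where they need to go.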

dot_partial_passer <- function(df, ...) {
  p_mean <- partial(mean, ...)
  df %>% 
    nest(data = -country) %>%
    mutate(pop = map_dbl(data, ~ p_mean(.$population)))
}

population %>% 
  dot_partial_passer(na.rm = TRUE)
#> # A tibble: 219 x 3
#>    country             data                    pop
#>    <chr>               <list>                <dbl>
#>  1 Afghanistan         <tibble [19 × 2]> 23703086.
#>  2 Albania             <tibble [19 × 2]>  3235417 
#>  3 Algeria             <tibble [19 × 2]> 34047477.
#>  4 American Samoa      <tibble [19 × 2]>    56705.
#>  5 Andorra             <tibble [19 × 2]>    74590.
#>  6 Angola              <tibble [19 × 2]> 16275446.
#>  7 Anguilla            <tibble [19 × 2]>    12246.
#>  8 Antigua and Barbuda <tibble [19 × 2]>    80887.
#>  9 Argentina           <tibble [19 × 2]> 38252583.
#> 10 Armenia             <tibble [19 × 2]>  3042757.
#> # … with 209 more rows
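
For anyone who hasn't used partial() before, here is a small sketch of what it does on its own, using toy vectors instead of the population data:

```r
library(purrr)

# partial() returns a new function with na.rm = TRUE already filled in
p_mean <- partial(mean, na.rm = TRUE)

single <- p_mean(c(1, NA, 3))
single
#> [1] 2

# Inside a lambda, p_mean() needs no `...` at all
many <- map_dbl(list(c(1, NA, 3), c(2, 4)), ~ p_mean(.x))
many
#> [1] 2 3
```

Because the extra arguments are baked in before the lambda is created, the lambda never has to touch the dots.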

Finally, this also works because the dots are passed to map_dbl() directly rather than used inside an anonymous function, but it’s a little confusing and could be difficult to set up in more complicated situations.

dot_passer2 <- function(df, ...) {
  df %>% 
    nest(-country) %>% 
    mutate(pop = map(data, "population") %>% map_dbl(mean, ...))
}

population %>% 
  dot_passer2(na.rm = TRUE)
#> # A tibble: 219 x 3
#>    country             data                    pop
#>    <chr>               <list>                <dbl>
#>  1 Afghanistan         <tibble [19 × 2]> 23703086.
#>  2 Albania             <tibble [19 × 2]>  3235417 
#>  3 Algeria             <tibble [19 × 2]> 34047477.
#>  4 American Samoa      <tibble [19 × 2]>    56705.
#>  5 Andorra             <tibble [19 × 2]>    74590.
#>  6 Angola              <tibble [19 × 2]> 16275446.
#>  7 Anguilla            <tibble [19 × 2]>    12246.
#>  8 Antigua and Barbuda <tibble [19 × 2]>    80887.
#>  9 Argentina           <tibble [19 × 2]> 38252583.
#> 10 Armenia             <tibble [19 × 2]>  3042757.
#> # … with 209 more rows
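
One more option that the reprex above doesn't cover: as far as I can tell, an ordinary function(x) anonymous function, unlike a ~ lambda, has no ... of its own, so the ... inside it resolve to the arguments of the enclosing function. A minimal sketch on toy input (dot_fn_passer() is a made-up name):

```r
library(purrr)

# Hypothetical helper: function(x) declares no `...`, so the `...` in its
# body are looked up in dot_fn_passer(), where the function was defined
dot_fn_passer <- function(xs, ...) {
  map_dbl(xs, function(x) mean(x, ...))
}

out <- dot_fn_passer(list(c(1, NA, 3), c(2, 4)), na.rm = TRUE)
out
#> [1] 2 3
```

This keeps everything in one step, at the cost of a slightly wordier anonymous function.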