tidyverse / funs

Collection of low-level functions for working with vctrs

Quantile variant that returns a tibble #24

Open hadley opened 5 years ago

hadley commented 5 years ago
tibble::as_tibble(as.list(quantile(1:5)))
#> # A tibble: 1 x 5
#>    `0%` `25%` `50%` `75%` `100%`
#>   <dbl> <dbl> <dbl> <dbl>  <dbl>
#> 1     1     2     3     4      5

Created on 2019-02-08 by the reprex package (v0.2.1.9000)

Will need to think carefully about how the columns should be named.
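One possible direction, offered only as a sketch: the default names from `quantile()` (`0%`, `25%`, ...) are non-syntactic and need backticks, so the columns could instead get syntactic names derived from the probabilities. The `q` prefix and the `probs` vector below are illustrative assumptions, not something settled in this thread.

```r
library(tibble)

x <- 1:5
probs <- c(0, 0.25, 0.5, 0.75, 1)

# Drop the default "0%", "25%", ... names and build syntactic ones instead.
# The "q" prefix is an assumption made for illustration.
q <- stats::quantile(x, probs = probs, names = FALSE)
names(q) <- paste0("q", probs * 100)

# One row, one column per probability: q0, q25, q50, q75, q100
wide <- as_tibble(as.list(q))
names(wide)
```

The trade-off is that names like `q25` are easier to type but no longer match what `quantile()` itself prints.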

DavisVaughan commented 4 years ago

A two-column tidy tibble may make sense:

# devtools::install_github("tidyverse/dplyr", ref = "across_simpler")
library(dplyr, warn.conflicts = FALSE)
library(tibble)
library(tidyr)

tidy_quantile <- function(x) {
  enframe(quantile(x, probs = c(0, .5, 1)), name = "percentile")
}

# ---

# One column
iris %>%
  group_by(Species) %>%
  summarise(tidy_quantile(Sepal.Length))
#> # A tibble: 9 x 3
#>   Species    percentile value
#>   <fct>      <chr>      <dbl>
#> 1 setosa     0%           4.3
#> 2 setosa     50%          5  
#> 3 setosa     100%         5.8
#> 4 versicolor 0%           4.9
#> 5 versicolor 50%          5.9
#> 6 versicolor 100%         7  
#> 7 virginica  0%           4.9
#> 8 virginica  50%          6.5
#> 9 virginica  100%         7.9

# Multiple columns with a return value in the most useful format
iris %>%
  pivot_longer(-Species, names_to = "measure") %>%
  group_by(Species, measure) %>%
  summarise(tidy_quantile(value))
#> # A tibble: 36 x 4
#> # Groups:   Species [3]
#>    Species measure      percentile value
#>    <fct>   <chr>        <chr>      <dbl>
#>  1 setosa  Petal.Length 0%           1  
#>  2 setosa  Petal.Length 50%          1.5
#>  3 setosa  Petal.Length 100%         1.9
#>  4 setosa  Petal.Width  0%           0.1
#>  5 setosa  Petal.Width  50%          0.2
#>  6 setosa  Petal.Width  100%         0.6
#>  7 setosa  Sepal.Length 0%           4.3
#>  8 setosa  Sepal.Length 50%          5  
#>  9 setosa  Sepal.Length 100%         5.8
#> 10 setosa  Sepal.Width  0%           2.3
#> # … with 26 more rows

Created on 2019-11-09 by the reprex package (v0.3.0.9000)
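A note for readers on newer versions: in dplyr 1.1.0 and later, returning more than one row per group from `summarise()` is deprecated in favor of `reframe()`. The sketch below assumes dplyr >= 1.1.0 and reuses the `tidy_quantile()` helper defined above; it is an adaptation of the first example, not part of the original reprex.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

# Same helper as in the reprex above
tidy_quantile <- function(x) {
  enframe(quantile(x, probs = c(0, .5, 1)), name = "percentile")
}

# reframe() allows any number of rows per group; `.by` groups for this call only
# (both assume dplyr >= 1.1.0)
out <- iris %>%
  reframe(tidy_quantile(Sepal.Length), .by = Species)

out
```

Unlike the grouped `summarise()` call, this returns an ungrouped tibble.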

hadley commented 4 years ago

Could possibly have quantile_wide() and quantile_long()? Or have some special data frame subclass so that you could t() it?
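A minimal sketch of what that pair might look like. Only the function names come from the comment above; the signatures, the numeric `prob` column, and the default `probs` are assumptions made for illustration.

```r
library(tibble)

# Hypothetical helpers: names taken from the suggestion above,
# signatures and bodies are assumptions.
quantile_long <- function(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = FALSE) {
  # Keep the probabilities as a numeric column rather than "25%"-style labels
  tibble(
    prob = probs,
    value = stats::quantile(x, probs = probs, na.rm = na.rm, names = FALSE)
  )
}

quantile_wide <- function(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = FALSE) {
  # One row, one column per probability, using quantile()'s default names
  q <- stats::quantile(x, probs = probs, na.rm = na.rm)
  as_tibble(as.list(q))
}

long <- quantile_long(1:5)
wide <- quantile_wide(1:5)
```

Both shapes carry the same information, so a `t()`-style method on a subclass would amount to converting between them; that variant is not sketched here.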