tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

Preserve groups even when variable is not a factor? #4195

Closed: will458 closed this issue 5 years ago

will458 commented 5 years ago

Hi,

Is there any chance that the feature for preserving empty groups will become available when grouping by non-factor variables?

Here is a reproducible example of what I mean:

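library(dplyr, warn.conflicts = FALSE)

# f1 is a character vector here, so it carries no information about a group "c"
df <- tibble(
    f1 = c("a", "a", "a", "b", "b"),
    f2 = factor(c("d", "e", "d", "e", "f"), levels = c("d", "e", "f")),
    x  = c(1, 1, 1, 2, 2),
    y  = 1:5
)
df %>% count(f1)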

vs.

df <- tibble(
    f1 = factor(c("a", "a", "a", "b", "b"), levels = c("a", "b", "c")), 
    f2 = factor(c("d", "e", "d", "e", "f"), levels = c("d", "e", "f")), 
    x  = c(1, 1, 1, 2, 2), 
    y  = 1:5
)
df %>% count(f1)

Notice that f1 is a character vector in the first case and a factor in the second. count() only returns the empty group in the second example.

romainfrancois commented 5 years ago

I don't understand what you mean. How is dplyr supposed to know that group "c" exists in the first example?

Also, if you want the empty groups, you have to explicitly use .drop = FALSE in the second example:

library(dplyr, warn.conflicts = FALSE)

df <- tibble(
  f1 = factor(c("a", "a", "a", "b", "b"), levels = c("a", "b", "c")), 
  f2 = factor(c("d", "e", "d", "e", "f"), levels = c("d", "e", "f")), 
  x  = c(1, 1, 1, 2, 2), 
  y  = 1:5
)
df %>% 
  count(f1, .drop = FALSE)
#> # A tibble: 3 x 2
#>   f1        n
#>   <fct> <int>
#> 1 a         3
#> 2 b         2
#> 3 c         0

Created on 2019-02-20 by the reprex package (v0.2.1.9000)
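
For a character column, one possible workaround is to tell dplyr about the missing groups yourself by converting the column to a factor before counting. This is only a minimal sketch, assuming the full set of groups is known in advance; the `expected_levels` vector is a hypothetical name introduced for illustration and is not part of the original example.

```r
library(dplyr, warn.conflicts = FALSE)

# Same data as the first example: f1 is a plain character vector
df <- tibble(
  f1 = c("a", "a", "a", "b", "b"),
  y  = 1:5
)

# Assumption: the caller knows every group that should appear,
# including ones with no rows ("c" here). Hypothetical name.
expected_levels <- c("a", "b", "c")

res <- df %>%
  # Declare the levels explicitly so that "c" exists as a (currently empty) group
  mutate(f1 = factor(f1, levels = expected_levels)) %>%
  # .drop = FALSE keeps factor levels that have no rows
  count(f1, .drop = FALSE)

res
```

Under these assumptions the result should contain one row per level, with a count of 0 for "c". The levels have to come from somewhere outside the data, which is why a bare character column cannot produce the empty group on its own.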
