tidyverse / dplyr

dplyr: A grammar of data manipulation
https://dplyr.tidyverse.org/

Documentation error for count #7053

Closed: gevro closed this issue 4 months ago

gevro commented 4 months ago

Hi, there is an error in this documentation: https://dplyr.tidyverse.org/reference/count.html

It says that count() lets you 'count the unique values', but that isn't true. It is just counting the number of rows. n_distinct() counts unique values.

DavisVaughan commented 4 months ago

I think "count the unique values of one or more variables" is exactly what it does. If you call count(df, x) and x is c("a", "b", "a"), then you get a count of 2 for a and 1 for b.
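
A minimal sketch of that example, assuming dplyr is installed. The data frame name df and its single column x follow the comment above; the variable name counts is only illustrative.

```r
library(dplyr)

# One column x holding the values from the comment above
df <- tibble(x = c("a", "b", "a"))

# count() returns one row per unique value of x, plus an n column
# giving how many rows carry that value
counts <- count(df, x)
print(counts)
```

The result has one row for "a" with n = 2 and one row for "b" with n = 1, so each unique value gets its own row count.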

gevro commented 4 months ago

Thanks. In that case, how is it different from n_distinct()?

DavisVaughan commented 4 months ago

n_distinct() can be used inside a mutate() or summarise() call and be generated alongside other columns. count() is a top-level verb that always produces a single kind of output: the group variables followed by an n column.
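
A short sketch of the difference, again assuming dplyr is installed and reusing the same illustrative data. The names df, per_value, overall, n_rows, and n_unique_x are hypothetical and chosen only for this example.

```r
library(dplyr)

df <- tibble(x = c("a", "b", "a"))

# count() is a verb: it takes the data frame and returns a new one
# with the grouping column(s) followed by n
per_value <- count(df, x)

# n_distinct() is a summary function: it returns a single number and
# can sit next to other summaries inside summarise()
overall <- summarise(
  df,
  n_rows     = n(),           # total number of rows
  n_unique_x = n_distinct(x)  # number of distinct values in x
)

print(per_value)
print(overall)
```

Here per_value has two rows (one per unique value of x), while overall has a single row reporting 3 rows and 2 distinct values.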

gevro commented 4 months ago

Thanks for explaining!