nathaneastwood / poorman

A poor man's dependency free grammar of data manipulation
https://nathaneastwood.github.io/poorman/

Implement n_distinct() #16

Closed msberends closed 4 years ago

msberends commented 4 years ago

dplyr::n_distinct() supports not only vectors but also data.frames, for which it returns the number of unique rows.

Suggestion for your pkg:

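n_distinct <- function(..., na.rm = FALSE) {
  # Combine the inputs; a data.frame (or any list) stays a list here
  out <- c(...)
  if (is.list(out)) {
    # List input: count unique rows (na.rm is not applied in this branch)
    return(NROW(unique(as.data.frame(out, stringsAsFactors = FALSE))))
  }
  if (isTRUE(na.rm)) {
    # Atomic input: drop missing values before counting
    out <- out[!is.na(out)]
  }
  length(unique(out))
}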

Testing:

dplyr::n_distinct(mtcars)
#> [1] 32
n_distinct(mtcars)
#> [1] 32

test_vector <- c("A", "B", "A", "B")
dplyr::n_distinct(test_vector)
#> [1] 2
n_distinct(test_vector)
#> [1] 2

test_data.frame1 <- data.frame(test = c("A", "A"), test2 = c("B", "B"))
dplyr::n_distinct(test_data.frame1)
#> [1] 1
n_distinct(test_data.frame1)
#> [1] 1

test_data.frame2 <- data.frame(test = c("A", "A"), test2 = c("B", "C"))
dplyr::n_distinct(test_data.frame2)
#> [1] 2
n_distinct(test_data.frame2)
#> [1] 2
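
The tests above cover a single vector or a single data.frame. As far as I know, dplyr::n_distinct() also accepts several vectors and counts unique combinations, whereas c(...) would concatenate them into one long vector. Below is a minimal sketch of a variant that handles that case and also applies na.rm to data.frame input. The name n_distinct_cols is hypothetical and chosen only to avoid clashing with the suggestion above; dropping any row that contains an NA is an assumption about the intended na.rm behaviour, not something confirmed in this thread.

```r
# Hypothetical variant, not the package's implementation: treat every
# argument as one or more columns and count the distinct rows.
n_distinct_cols <- function(..., na.rm = FALSE) {
  # A vector contributes one column, a data.frame contributes all of its columns
  df <- data.frame(..., stringsAsFactors = FALSE)
  if (isTRUE(na.rm) && ncol(df) > 0) {
    # Assumption: a row is dropped if any of its values is missing
    df <- df[stats::complete.cases(df), , drop = FALSE]
  }
  nrow(unique(df))
}

# Two vectors: four distinct combinations, where c(...) would yield 2
n_distinct_cols(c(1, 1, 2, 2), c(1, 2, 1, 2))
#> [1] 4

# Missing values are excluded only when na.rm = TRUE
n_distinct_cols(c(1, NA, 2, NA), na.rm = TRUE)
#> [1] 2
```

Building a data.frame first keeps one code path for vectors and data.frames alike, at the cost of some overhead for plain vectors.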