DavZim / dataverifyr

A Lightweight, Flexible, and Fast Data Validation Package that Can Handle All Sizes of Data
https://davzim.github.io/dataverifyr/

Count duplicated values #11

Closed: Joe-Heffer-Shef closed this issue 10 months ago

Joe-Heffer-Shef commented 10 months ago

Is there a way to count the number of duplicated values for a column? Or is this package just for validating individual cells in a data frame?

For example, I'd like to be able to define a rule that has a result like this:

# Rule name: data$my_column is unique
result["fail"] = sum(duplicated(data$my_column))
DavZim commented 10 months ago

At the moment, this package only allows validation of data at the row level. There is #10, which adds a describe() function that also counts unique values, but that is not a rule per se.

But you can use the !duplicated(var) rule to check if a variable has only unique values. Is this what you had in mind?

library(dataverifyr)

# one rule per column: the column must contain no duplicated values
rs <- ruleset(
  rule(!duplicated(uniq)),
  rule(!duplicated(non_uniq))
)

# example data: uniq has no duplicates, non_uniq has one
data <- data.frame(
  uniq = 1:3,
  non_uniq = c(1, 1, 2)
)

check_data(data, rs)
#> # A tibble: 2 × 10
#>   name               expr    allow_na negate tests  pass  fail warn  error time 
#>   <chr>              <chr>   <lgl>    <lgl>  <int> <int> <int> <chr> <chr> <drt>
#> 1 Rule for: uniq     !dupli… FALSE    FALSE      3     3     0 ""    ""    0.00…
#> 2 Rule for: non_uniq !dupli… FALSE    FALSE      3     2     1 ""    ""    0.00…

Created on 2023-10-24 with reprex v2.0.2
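
If you need the actual number of duplicates rather than just a pass/fail, a minimal sketch (based on the reprex above) is to read it off the fail column of the result:

res <- check_data(data, rs)
# for a !duplicated(x) rule, fail counts the rows flagged as duplicates,
# i.e. the same number as sum(duplicated(x))
res$fail[res$name == "Rule for: non_uniq"]
#> [1] 1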