Closed Joe-Heffer-Shef closed 10 months ago
At the moment, this package only supports validating data at the row level.
There is #10, which adds a describe() function that also counts unique values, but that is not a rule per se.
You can, however, use a !duplicated(var) rule to check whether a variable contains only unique values. Is this what you had in mind?
library(dataverifyr)
rs <- ruleset(
  rule(!duplicated(uniq)),
  rule(!duplicated(non_uniq))
)
data <- data.frame(
  uniq = 1:3,
  non_uniq = c(1, 1, 2)
)
check_data(data, rs)
#> # A tibble: 2 × 10
#>   name               expr    allow_na negate tests  pass  fail warn  error time
#>   <chr>              <chr>   <lgl>    <lgl>  <int> <int> <int> <chr> <chr> <drt>
#> 1 Rule for: uniq     !dupli… FALSE    FALSE      3     3     0 ""    ""    0.00…
#> 2 Rule for: non_uniq !dupli… FALSE    FALSE      3     2     1 ""    ""    0.00…
Created on 2023-10-24 with reprex v2.0.2
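A note on reading the fail count above: base R's duplicated() marks only the second and later occurrences of a value, which is why non_uniq reports 1 failure even though two rows share the value 1. The following is a minimal base R sketch (it does not use dataverifyr) of three ways to count duplicates in a column. The variable names are illustrative only, and whether the combined fromLast expression behaves the same inside rule() is an assumption that has not been tested here.

```r
# Base R only; dataverifyr is not needed for this sketch.
non_uniq <- c(1, 1, 2)

# duplicated() flags only the second and later occurrences of a value,
# so the first 1 is not marked.
dup_later <- duplicated(non_uniq)
n_dup_later <- sum(dup_later)   # 1

# Flag every member of a duplicate group by also scanning from the end.
dup_any <- duplicated(non_uniq) | duplicated(non_uniq, fromLast = TRUE)
n_dup_any <- sum(dup_any)       # 2

# Per-value counts, keeping only values that occur more than once.
counts <- table(non_uniq)
dup_counts <- counts[counts > 1]
```

Here n_dup_later matches the fail count reported by check_data(), while n_dup_any counts every row that belongs to a group of repeated values.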
Is there a way to count the number of duplicated values for a column? Or is this package just for validating individual cells in a data frame?
For example, I'd like to be able to define a rule that has a result like this: