Open ddsjoberg opened 1 month ago
Just need to add this function
check_na_factor_levels <- function(data, variables) {
  # error if any of the selected columns is a factor with an NA level
  walk(
    variables,
    \(variable) {
      if (is.factor(data[[variable]]) && any(is.na(levels(data[[variable]])))) {
        cli::cli_abort(
          "Factors with {.val {NA}} levels are not allowed, which are present in column {.val {variable}}.",
          call = get_cli_abort_call()
        )
      }
    }
  )
}
I was thinking about a special (and annoying) case where a factor has an explicit level for missing values, but that level has no name (the level itself is NA). I think the most common way to end up here is when users call forcats::fct_na_value_to_level(). We don't get an error when running ard_continuous(), BUT I think it'll throw a wrench into the shuffle functions (due to all the assumptions we make about NA values). I am fine with detecting a level without a name and returning an error. What do you think we should do, @bzkrouse?
Created on 2024-06-02 with reprex v2.1.0
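For anyone who wants to see the case in isolation, here is a minimal base-R sketch. It uses addNA() as a stand-in for forcats::fct_na_value_to_level() (an assumption: both turn NA into an explicit level). The helper has_na_level() and the column names are hypothetical and not part of the package.

```r
# A factor with an implicit missing value vs. one where NA is an explicit level.
# addNA() is used here as a base-R stand-in for forcats::fct_na_value_to_level().
f_plain <- factor(c("a", "b", NA))
f_na_level <- addNA(f_plain)

levels(f_plain)     # no NA among the levels
levels(f_na_level)  # NA is now one of the levels

# With an explicit NA level, is.na() no longer flags the observation as missing,
# which is the kind of assumption downstream code may rely on.
n_missing_plain <- sum(is.na(f_plain))
n_missing_na_level <- sum(is.na(f_na_level))

# Hypothetical helper: the same test the proposed check applies to each column.
has_na_level <- function(x) is.factor(x) && anyNA(levels(x))

df <- data.frame(plain = f_plain, explicit = f_na_level)
flagged <- names(df)[vapply(df, has_na_level, logical(1))]
flagged
```

Only the column with the explicit NA level should be flagged, so a check along these lines would leave ordinary factors with missing values alone.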