Closed wtimmerman-fitp closed 2 years ago
Based on a quick read, I think you might be interested in fct()? More in #299.
Oh, this is perfect! Thank you for the pointer! I think this will solve my issue: the named levels argument is there, there are no errors or warnings if an additional level is listed but not present in the data, and (unlike base::factor) it errors if a value in the data is not among the supplied levels.
I'll close the issue and look forward to fct() getting into a future release.
(Example below if anyone is curious.)
#setup ----
library(tidyverse)
fct <- function(x = character(), levels = NULL, na = character()) {
  if (!is.character(x)) {
    cli::cli_abort("{.arg x} must be a character vector")
  }
  if (!is.character(na)) {
    cli::cli_abort("{.arg na} must be a character vector")
  }
  # Treat any value listed in `na` as missing
  x[x %in% na] <- NA
  if (is.null(levels)) {
    levels <- unique(x)
  } else if (!is.character(levels)) {
    cli::cli_abort("{.arg levels} must be a character vector")
  }
  # Error on values of `x` that are not among the allowed levels
  invalid <- setdiff(x, c(levels, NA))
  if (length(invalid) > 0) {
    cli::cli_abort(c(
      "Values of {.arg x} must be members of {.arg levels}",
      i = "Invalid value{?s}: {.str {invalid}}"
    ))
  }
  factor(x, levels = levels, exclude = NULL)
}
mtcars2 <-
  mtcars %>%
  tibble::rownames_to_column(var = "make_model") %>%
  dplyr::filter(
    dplyr::row_number() <= 5
  )
# Match levels----
match_levels <-
  mtcars2 %>%
  dplyr::pull(make_model)
mtcars2_factor <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = base::factor(
      make_model,
      levels = match_levels
    )
  )
mtcars2_fct <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = fct(
      make_model,
      levels = match_levels
    )
  )
# Add Levels ----
add_levels <-
  c(match_levels, "Other Car")
mtcars2_add_factor <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = base::factor(
      make_model,
      levels = add_levels
    )
  )
mtcars2_add_fct <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = fct(
      make_model,
      levels = add_levels
    )
  )
levels(mtcars2_add_fct$make_model)
#> [1] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710"
#> [4] "Hornet 4 Drive" "Hornet Sportabout" "Other Car"
# Miss Levels ----
miss_levels <-
  match_levels[-1]
mtcars2_miss_factor <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = base::factor(
      make_model,
      levels = miss_levels
    )
  )
mtcars2_miss_fct <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = fct(
      make_model,
      levels = miss_levels
    )
  )
#> Error in `dplyr::mutate()`:
#> ! Problem while computing `make_model = fct(make_model, levels =
#> miss_levels)`.
#> Caused by error in `fct()`:
#> ! Values of `x` must be members of `levels`
#> i Invalid value: "Mazda RX4"
Created on 2022-08-09 by the reprex package (v2.0.1)
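For contrast with the error above, here is a minimal base R sketch of what base::factor() does in the "Miss Levels" case: a value that is not among the levels is silently turned into NA. The names cars and miss_levels are illustrative only and stand in for the make_model column and level vector used in the reprex.

```r
# Illustrative values only; any character vector behaves the same way.
cars <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710")
miss_levels <- cars[-1]  # drop "Mazda RX4" from the allowed levels

# base::factor() does not complain about the unmatched value ...
f <- factor(cars, levels = miss_levels)

# ... it silently converts it to NA.
is.na(f)
#> [1]  TRUE FALSE FALSE

# A manual check that surfaces the same value fct() reports as invalid.
setdiff(cars, miss_levels)
#> [1] "Mazda RX4"
```

This is why the pre-specified level list only works as a consistency check when the conversion function errors on unmatched values, as fct() does.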
Also, if anyone runs into the same warning I got with fct_relevel (Warning: Outer names are only allowed for unnamed scalar atomic inputs), it's because fct_relevel has no levels argument; pass the vector of level names (in this case, use_levels) unnamed into the ellipsis, like:
mtcars2_fct_relevel <-
  mtcars2 %>%
  dplyr::mutate(
    make_model = forcats::fct_relevel(
      make_model,
      use_levels
    )
  )
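Since use_levels is not defined in the snippet above, here is a small self-contained sketch of the same idea. The factor f and the contents of use_levels are made up for illustration, and it assumes forcats is installed.

```r
library(forcats)

f <- factor(c("a", "b", "c"))
use_levels <- c("c", "a")  # hypothetical level order, for illustration only

# Pass the character vector unnamed through `...`;
# do not write `levels = use_levels`.
f2 <- fct_relevel(f, use_levels)
levels(f2)
#> [1] "c" "a" "b"
```

Note that fct_relevel() only reorders: levels that are not listed stay in the factor, after the listed ones, so it is not a substitute for the validation that fct() performs.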
When I use fct_relevel with the levels argument, I receive a warning that does not clearly indicate what is going wrong. Similarly, when I use the levels argument in forcats::as_factor() (on the assumption that arguments in the ellipsis will be passed on to methods), I receive the error "Arguments in `...` must be used". Both of these are unexpected results for me based on my understanding of the function help text and base::factor().

For background, my intention is to convert a character column into a factor column using a pre-specified list of levels (the pre-specified list is somewhat important as a check and for consistency, for reasons that I won't get into here). I have reviewed the forcats issues and don't see an exact match for this problem:
`...` must be used." I am not clear if I am misusing the function.

My questions are:
Reprex
Created on 2022-08-09 by the reprex package (v2.0.1)
Session info
``` r
sessionInfo()
#> R version 4.0.5 (2021-03-31)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19043)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.9     purrr_0.3.4
#> [5] readr_2.1.2     tidyr_1.2.0     tibble_3.1.8    ggplot2_3.3.6
#> [9] tidyverse_1.3.2
#>
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.1.2    xfun_0.31           haven_2.5.0
#>  [4] gargle_1.2.0        colorspace_2.0-3    vctrs_0.4.1
#>  [7] generics_0.1.3      htmltools_0.5.3     yaml_2.3.5
#> [10] utf8_1.2.2          rlang_1.0.4         pillar_1.8.0
#> [13] glue_1.6.2          withr_2.5.0         DBI_1.1.3
#> [16] dbplyr_2.2.1        readxl_1.4.0        modelr_0.1.8
#> [19] lifecycle_1.0.1     munsell_0.5.0       gtable_0.3.0
#> [22] cellranger_1.1.0    rvest_1.0.2         evaluate_0.15
#> [25] knitr_1.39          tzdb_0.3.0          fastmap_1.1.0
#> [28] fansi_1.0.3         highr_0.9           broom_1.0.0
#> [31] backports_1.4.1     scales_1.2.0        googlesheets4_1.0.0
#> [34] jsonlite_1.8.0      fs_1.5.2            hms_1.1.1
#> [37] digest_0.6.29       stringi_1.7.8       grid_4.0.5
#> [40] cli_3.3.0           tools_4.0.5         magrittr_2.0.3
#> [43] crayon_1.5.1        pkgconfig_2.0.3     ellipsis_0.3.2
#> [46] xml2_1.3.3          reprex_2.0.1        googledrive_2.0.0
#> [49] lubridate_1.8.0     assertthat_0.2.1    rmarkdown_2.14
#> [52] httr_1.4.3          rstudioapi_0.13     R6_2.5.1
#> [55] compiler_4.0.5
```