Closed brancengregory closed 3 months ago
Replaced the main string detection code as follows:
apply_regex_pattern <- function(data, flag, regex_pattern) {
  # NOTE: `col_to_clean` is not an argument of this function; it is assumed
  # to be defined in the calling scope.
  data |>
    # Old {stringr} version:
    # dplyr::mutate(
    #   !!flag := stringr::str_detect(!!dplyr::sym(col_to_clean),
    #                                 stringr::regex(regex_pattern, ignore_case = TRUE))
    # )
    # New {stringi} version:
    dplyr::mutate(
      !!flag := stringi::stri_detect(str = !!dplyr::sym(col_to_clean),
                                     regex = paste0("(?i)", regex_pattern)) # Case insensitive
    )
}
It doesn't seem to save much time, but the results are exactly the same (checked with setdiff()).
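
For reference, a minimal sketch of this kind of equivalence check. The sample vector `x` and the pattern are made up for illustration; the real check was run on the project's data, and the exact setdiff() call used there isn't shown here.

```r
# Hypothetical sample data and pattern, for illustration only
x <- c("Possession of CDS", "possession of marijuana", "Burglary", NA)
regex_pattern <- "possess"

# Old {stringr} call and new {stringi} call on the same input
old <- stringr::str_detect(x, stringr::regex(regex_pattern, ignore_case = TRUE))
new <- stringi::stri_detect(str = x, regex = paste0("(?i)", regex_pattern))

# identical() compares the two logical vectors, including NA positions
same <- identical(old, new)

# setdiff() on the matched row indices, in both directions;
# both should be empty if the two versions flag the same rows
only_old <- setdiff(which(old), which(new))
only_new <- setdiff(which(new), which(old))
```

Checking setdiff() in both directions matters because setdiff(a, b) alone only reports rows flagged by `a` and missed by `b`.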
Benchmarks show that run time for string detection is cut roughly in half. We also get a small performance boost by specifying case insensitivity in the regex pattern itself with (?i).
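
A rough sketch of how such a comparison could be timed, assuming a synthetic input vector and base R's system.time() rather than whatever benchmarking tool was actually used. The `time_it` helper, the sample strings, and the repetition count are all hypothetical. The third variant passes case insensitivity through `stringi::stri_opts_regex()` instead of the inline `(?i)` flag, so the two approaches can be compared.

```r
# Hypothetical input: a large character vector built from a few sample strings
set.seed(1)
x <- sample(
  c("Possession of CDS", "Burglary in the second degree", "Larceny"),
  size = 1e5, replace = TRUE
)
regex_pattern <- "possess"

# Hypothetical helper: total elapsed seconds for `reps` calls of `f`
time_it <- function(f, reps = 20) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

timings <- c(
  # Old {stringr} version
  stringr = time_it(function() {
    stringr::str_detect(x, stringr::regex(regex_pattern, ignore_case = TRUE))
  }),
  # New {stringi} version with the inline (?i) flag
  stringi_inline = time_it(function() {
    stringi::stri_detect(str = x, regex = paste0("(?i)", regex_pattern))
  }),
  # {stringi} with case insensitivity set via regex options instead
  stringi_opts = time_it(function() {
    stringi::stri_detect_regex(
      x, regex_pattern,
      opts_regex = stringi::stri_opts_regex(case_insensitive = TRUE)
    )
  })
)

timings
```

Timings from a sketch like this vary by machine, input size, and pattern, so they are only a sanity check, not a replacement for benchmarks on the real data.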