I have encountered a problem while tuning with the ranger engine. The following error occurred in every resampling step:
Error: Functions involving factors or characters have been detected on the RHS of formula. These are not allowed when indicators = "none". Functions involving factors were detected for the following columns:
I think I have traced it back to the mold() function. When dummy coding is turned off (indicators = "none"), the error occurs if the formula contains a factor variable and a second, numeric variable whose name contains the FULL name of that factor variable.
I used the penguins data from your mold vignette for the reproducible example.
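To make the suspected mechanism concrete, here is a minimal base R sketch. It assumes the factor detection does some form of substring matching on term names; that is my guess from the symptoms, not hardhat's actual code, and the vector names are made up for illustration.

```r
# Illustration only: NOT hardhat's implementation.
# Assumption: the check looks for the factor name anywhere inside a term name.
factor_name <- "species"
term_names <- c("body_mass_g", "species", "xxspeciesxx", "xxspecIesxx")

# Substring match: also flags the numeric column "xxspeciesxx"
naive <- grepl(factor_name, term_names, fixed = TRUE)

# Exact match: flags only the factor column itself
exact <- term_names == factor_name

print(data.frame(term = term_names, naive = naive, exact = exact))
```

If the check behaves like `naive`, the numeric column is treated as a function involving the factor, which would explain why changing a single letter makes the error disappear.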
Reproducible example
library(tidyverse)
library(hardhat)
library(modeldata)
data("penguins")
# species : FACTOR variable
# xxspeciesxx: NUMERIC variable contains "species" within its own name (FULL name of factor variable)
# xxspecIesxx: NUMERIC variable, same as xxspeciesxx, changed small i to capital I
penguins <- na.omit(penguins) %>%
  dplyr::mutate(
    xxspeciesxx = as.numeric(species),
    xxspecIesxx = as.numeric(species)
  )
# xxspeciesxx: ERROR (contains FULL name of factor variable)
mold(
  ~ body_mass_g + species + xxspeciesxx,
  penguins,
  blueprint = default_formula_blueprint(indicators = "none")
)
# xxspecIesxx: WORKS FINE (does NOT contain FULL name of factor variable, because of the capital I)
mold(
  ~ body_mass_g + species + xxspecIesxx,
  penguins,
  blueprint = default_formula_blueprint(indicators = "none")
)
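As a stopgap, one could scan a data frame for column names that would trigger this. The helper below is a hypothetical sketch in base R (`find_name_collisions` is my own name, not part of hardhat), and it assumes the collision is a plain, case-sensitive substring match as observed above.

```r
# Hypothetical helper, not part of hardhat: for each factor column, list the
# non-factor columns whose names contain the factor column's full name.
find_name_collisions <- function(data) {
  is_fct <- vapply(data, is.factor, logical(1))
  fct_names <- names(data)[is_fct]
  other_names <- names(data)[!is_fct]

  collisions <- list()
  for (f in fct_names) {
    # Case-sensitive, literal substring match (assumed trigger condition)
    hit <- other_names[grepl(f, other_names, fixed = TRUE)]
    if (length(hit) > 0) collisions[[f]] <- hit
  }
  collisions
}

# Toy data mirroring the column names from the example above
toy <- data.frame(
  species = factor(c("a", "b")),
  xxspeciesxx = c(1, 2),
  xxspecIesxx = c(1, 2),
  body_mass_g = c(3, 4)
)

find_name_collisions(toy)
```

On the toy data this reports `xxspeciesxx` under `species`. Renaming such a column before calling mold() should avoid the error, just as the capital-I variant does.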
A more realistic naming example where tuning would fail might be: