tidymodels / hardhat

Construct Modeling Packages
https://hardhat.tidymodels.org

Issue in hardhat::mold(): Naming of factor and numeric variables #182

Closed MasterLuke84 closed 2 years ago

MasterLuke84 commented 2 years ago

Hello,

I have encountered a problem during tuning with the ranger engine. The following error occurred in every resampling step: Error: Functions involving factors or characters have been detected on the RHS of formula. These are not allowed when indicators = "none". Functions involving factors were detected for the following columns:

I think I could trace it back to the mold() function. When no dummy coding is requested (indicators = "none"), the error occurs if the formula contains a factor variable and also a numeric variable whose name contains the FULL name of that factor variable.
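
The suspected behaviour can be sketched in base R. This is only an illustration of substring versus exact name matching, which would be consistent with the symptoms described here; it is not hardhat's actual code, and `factor_name` and `candidates` are made-up names for the example.

```r
# Hypothetical illustration only; this is NOT hardhat's internal implementation.
factor_name <- "species"
candidates  <- c("body_mass_g", "species", "xxspeciesxx", "xxspecIesxx")

# Substring match: flags every name that contains "species" anywhere,
# including the numeric column xxspeciesxx.
substring_hit <- grepl(factor_name, candidates, fixed = TRUE)

# Exact match: flags only the factor column itself.
exact_hit <- candidates == factor_name

# Side-by-side comparison of the two matching strategies.
print(data.frame(candidates, substring_hit, exact_hit))
```

If a check of this kind used substring matching, xxspeciesxx would be treated as involving the factor species, while xxspecIesxx would not, because the match is case-sensitive.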

I used the penguin data from your mold-vignette for the reproducible example.

Reproducible example

library(tidyverse)
library(hardhat)
library(modeldata)

data("penguins")
# species    : FACTOR variable
# xxspeciesxx: NUMERIC variable contains "species" within its own name (FULL name of factor variable)
# xxspecIesxx: NUMERIC variable, same as xxspeciesxx, changed small i to capital I
penguins <- na.omit(penguins) %>% 
  dplyr::mutate(xxspeciesxx = as.numeric(species),
                xxspecIesxx = as.numeric(species))

#  xxspeciesxx: ERROR (contains FULL name of factor variable)
mold(
  ~ body_mass_g + species + xxspeciesxx, 
  penguins, 
  blueprint = default_formula_blueprint(indicators = "none")
)

# xxspecIesxx: WORKS FINE (does NOT contain FULL name of factor variable, because of the capital I)
mold(
  ~ body_mass_g + species + xxspecIesxx, 
  penguins, 
  blueprint = default_formula_blueprint(indicators = "none")
)

A more realistic naming example, where the tuning would fail, might be:

Best regards

github-actions[bot] commented 2 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.