tidymodels / recipes

Pipeable steps for feature engineering and data preprocessing to prepare for modeling
https://recipes.tidymodels.org

Uninformative error when `step_cut()` is applied to a column with missing values #1351

Closed EmilHvitfeldt closed 2 weeks ago

EmilHvitfeldt commented 4 months ago

The error comes from `create_full_breaks()` and should be handled properly in there as well.

library(recipes)

recipe(~ ., data = mtcars) %>%
  step_cut(mpg, breaks = 20) %>%
  prep()
#> 
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#> 
#> ── Inputs
#> Number of variables by role
#> predictor: 11
#> 
#> ── Training information
#> Training data contained 32 data points and no incomplete rows.
#> 
#> ── Operations
#> • Cut numeric for: mpg | Trained

mtcars[1, 1] <- NA

recipe(~ ., data = mtcars) %>%
  step_cut(mpg, breaks = 20) %>%
  prep()
#> Error in `step_cut()`:
#> Caused by error in `if (min(var) < min(breaks)) ...`:
#> ! missing value where TRUE/FALSE needed

Created on 2024-07-19 with reprex v2.1.0
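
For readers wondering why the message is so terse: the condition shown in the traceback, `min(var) < min(breaks)`, evaluates to `NA` as soon as `var` contains a missing value, and base R's `if` refuses an `NA` condition. The sketch below reproduces that in base R and shows one possible guard. `full_breaks_sketch()` is a hypothetical stand-in written for illustration; apart from the `min()` comparison visible in the error, its body is an assumption and not the actual recipes implementation or the fix that was adopted.

```r
# Base R only. `full_breaks_sketch()` is a hypothetical helper, not recipes code.
x <- c(NA, 21, 22.8, 21.4)

# min() propagates missing values unless told otherwise
min_with_na <- min(x)                 # NA
min_without_na <- min(x, na.rm = TRUE)

# The same base R failure as in the reprex, captured as a string
msg <- tryCatch(
  if (min(x) < 20) "extend breaks" else "keep breaks",
  error = function(e) conditionMessage(e)
)
msg

# One possible guard: ignore NAs when computing the range, and fail
# with a readable message when there is nothing left to compute on.
full_breaks_sketch <- function(var, breaks) {
  if (all(is.na(var))) {
    stop("`var` must contain at least one non-missing value.", call. = FALSE)
  }
  lo <- min(var, na.rm = TRUE)
  hi <- max(var, na.rm = TRUE)
  # Assumed behaviour: widen the breaks so they cover the observed range
  if (lo < min(breaks)) breaks <- c(lo, breaks)
  if (hi > max(breaks)) breaks <- c(hi, breaks)
  sort(breaks)
}

full_breaks_sketch(x, breaks = 20)
```

Whether missing values should be skipped silently, as in this sketch, or reported with a more informative error is a design decision for the package; the sketch only illustrates where the `NA` enters.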
