cross-posted from https://stackoverflow.com/questions/78834269/tidymodels-step-corr-fails-to-remove-highly-correlated-columns
You forgot to pass any variable selectors to step_corr(). All steps allow empty selections, in which case the step does nothing.
library(recipes)
df <- data.frame(x1 = runif(10)) %>%
  mutate(x2 = x1 + 1) %>%
  mutate(y = x1 + rnorm(10))
cor(df)
#> x1 x2 y
#> x1 1.0000000 1.0000000 0.6882089
#> x2 1.0000000 1.0000000 0.6882089
#> y 0.6882089 0.6882089 1.0000000
rec <- recipe(y ~ x1 + x2, data = df) %>%
  step_corr(all_predictors(), threshold = 0.9) %>%
  prep(df)
bake(rec, new_data=df)
#> # A tibble: 10 × 2
#> x2 y
#> <dbl> <dbl>
#> 1 1.06 -0.353
#> 2 1.53 -0.951
#> 3 1.87 2.51
#> 4 1.43 -0.288
#> 5 1.60 0.696
#> 6 1.64 0.296
#> 7 1.31 1.16
#> 8 1.07 -1.37
#> 9 1.49 -0.215
#> 10 1.70 1.16
Created on 2024-08-05 with reprex v2.1.0
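To make the difference concrete, here is a minimal sketch (not part of the original reprex) that preps the same kind of recipe twice: once with step_corr() left without selectors and once with all_predictors(). The seed value and the object names baked_empty and baked_sel are arbitrary choices for illustration.

```r
library(recipes)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x1 <- runif(10)
df <- data.frame(x1 = x1, x2 = x1 + 1, y = x1 + rnorm(10))

# No selectors: the step has nothing to act on, so both predictors survive.
baked_empty <- recipe(y ~ x1 + x2, data = df) %>%
  step_corr(threshold = 0.9) %>%
  prep(training = df) %>%
  bake(new_data = NULL)

# With a selector: one of the two perfectly correlated predictors is dropped.
baked_sel <- recipe(y ~ x1 + x2, data = df) %>%
  step_corr(all_predictors(), threshold = 0.9) %>%
  prep(training = df) %>%
  bake(new_data = NULL)

names(baked_empty)
names(baked_sel)
```

Comparing the two sets of column names should show all three columns in the first result and only one predictor plus y in the second.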
Minimal example: