EmilHvitfeldt / extrasteps

More Steps for the 'recipes' Package
https://emilhvitfeldt.github.io/extrasteps/

add step_target_knn #4

Open EmilHvitfeldt opened 3 years ago

EmilHvitfeldt commented 3 years ago

This step takes a target variable and any number of predictors. At prep time it trains a k-nearest neighbors (KNN) model using target ~ predictors; at bake time it uses the predictors to predict the target, and that prediction is the step's output.

This could be useful for geographic data, such as predicting house prices from latitude and longitude.
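To make the prep/bake split concrete, here is a minimal sketch written as two plain functions rather than a real recipes step. The names prep_target_knn() and bake_target_knn(), the output column name target_knn, and the simulated toy data are all assumptions made for illustration; only kknn::train.kknn() and its predict() method come from the example in this issue.

```r
library(kknn)

# Hypothetical helper: the "prep" half. Fits the KNN model once on the training data.
prep_target_knn <- function(training, target, predictors) {
  # reformulate() builds the formula target ~ predictor1 + predictor2 + ...
  form <- reformulate(predictors, response = target)
  list(
    fit = kknn::train.kknn(form, data = training),
    target = target,
    predictors = predictors
  )
}

# Hypothetical helper: the "bake" half. Predicts the target from the
# predictors in new_data and appends the prediction as a new column.
bake_target_knn <- function(object, new_data, name = "target_knn") {
  new_data[[name]] <- as.numeric(predict(object$fit, newdata = new_data))
  new_data
}

# Simulated toy data (an assumption, standing in for something like Sacramento)
set.seed(1234)
n <- 200
toy <- data.frame(lat = runif(n, 38, 39), lon = runif(n, -122, -121))
toy$price <- 1e5 + 5e4 * (toy$lat - 38) + rnorm(n, sd = 1e4)
train <- toy[1:150, ]
test <- toy[151:200, ]

prepped <- prep_target_knn(train, target = "price", predictors = c("lat", "lon"))
baked <- bake_target_knn(prepped, test)
head(baked)
```

One design question this sketch leaves open: baking the training data itself would let each row find itself as a nearest neighbor, so the new column could leak the target. A real step would probably need to handle that case, for example by skipping or resampling during training.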

library(tidymodels)
data("Sacramento", package = "modeldata")

set.seed(1234)
Sacramento_split <- initial_split(Sacramento)

Sacramento_train <- training(Sacramento_split)
Sacramento_test <- testing(Sacramento_split)

knn <- kknn::train.kknn(price ~ latitude + longitude, data = Sacramento_train)

Sacramento_test %>%
  mutate(new = predict(knn, newdata = Sacramento_test))
#> # A tibble: 233 × 10
#>    city            zip    beds baths  sqft type   price latitude longitude    new
#>    <fct>           <fct> <int> <dbl> <int> <fct>  <int>    <dbl>     <dbl>  <dbl>
#>  1 SACRAMENTO      z958…     2     1   797 Resi…  81900     38.5     -121. 1.27e5
#>  2 SACRAMENTO      z958…     3     1  1177 Resi…  91002     38.5     -121. 1.46e5
#>  3 RANCHO_CORDOVA  z956…     2     2   941 Condo  94905     38.6     -121. 3.50e5
#>  4 SACRAMENTO      z958…     2     2  1134 Condo 110700     38.7     -121. 2.51e5
#>  5 CITRUS_HEIGHTS  z956…     2     1   795 Condo 116250     38.7     -121. 1.99e5
#>  6 RIO_LINDA       z956…     3     2  1356 Resi… 121630     38.7     -121. 1.49e5
#>  7 ANTELOPE        z958…     3     2  1088 Resi… 126640     38.7     -121. 1.63e5
#>  8 SACRAMENTO      z958…     3     2  1380 Resi… 136500     38.6     -121. 1.90e5
#>  9 ANTELOPE        z958…     2     2  1043 Resi… 161250     38.7     -121. 2.20e5
#> 10 NORTH_HIGHLANDS z956…     4     2  1587 Resi… 161500     38.7     -121. 1.59e5
#> # … with 223 more rows

Created on 2021-08-10 by the reprex package (v2.0.1)