msevi closed this issue 4 years ago
Update:
I've downloaded the corresponding RDS file for car_train, and running the code exactly as in Section 11 of Chapter 1, I still get the error:
formula: Error: Functions involving factors or characters have been detected on the RHS of `formula`. These are not allowed when `indicators = "none"`. Functions involving factors were detected for the following columns: 'Lockup Torque Converter', 'Recommended Fuel', 'Fuel injection'.
--
You are correct that this is due to some changes in tidymodels, particularly how parsnip handles the predictor encodings and model.matrix() business under the hood. A goal I have with this first chapter is to not introduce too many things at once, so I may need to change up a few aspects of how the data is stored to reduce this tension.
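There are two things going on here. First, the character columns need to be converted to factors. Second, some of the column names have spaces in them, which we can fix by using janitor::clean_names() to make all the column names nicer.
I need to update the chapter for all of this, but in the meantime, you can do: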
car_vars <- cars2018 %>%
  select(-Model, -`Model Index`) %>%
  janitor::clean_names() %>%
  mutate_if(is.character, factor)
or with across() like you showed.
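For reference, the across() variant might look like the sketch below. The cars_toy data frame is a made-up stand-in for cars2018 (which has many more columns), so treat it as an illustration under that assumption, not as the chapter's actual code.

```r
library(dplyr)

# Hypothetical stand-in for cars2018, with a few columns named like the real ones.
cars_toy <- tibble::tibble(
  Model = c("A", "B", "C"),
  `Model Index` = c(1, 2, 3),
  `Recommended Fuel` = c("Regular", "Premium", "Regular"),
  MPG = c(30, 22, 27)
)

car_vars_toy <- cars_toy %>%
  select(-Model, -`Model Index`) %>%
  # Turn names such as `Recommended Fuel` into recommended_fuel.
  janitor::clean_names() %>%
  # across() + where() replaces the older mutate_if() idiom.
  mutate(across(where(is.character), factor))

names(car_vars_toy)
```

After this, the remaining column names contain no spaces and every character column is a factor.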
When I do this, both predict() and fit_resamples() work.
rf_mod %>%
  fit_resamples(
    log(mpg) ~ .,
    car_boot,
    control = control_resamples(save_pred = TRUE)
  )
# Resampling results
# Bootstrap sampling
# A tibble: 25 x 5
splits id .metrics .notes .predictions
<list> <chr> <list> <list> <list>
1 <split [917/343… Bootstrap01 <tibble [2 × 3… <tibble [0 × … <tibble [343 × 3…
2 <split [917/330… Bootstrap02 <tibble [2 × 3… <tibble [0 × … <tibble [330 × 3…
3 <split [917/346… Bootstrap03 <tibble [2 × 3… <tibble [0 × … <tibble [346 × 3…
4 <split [917/335… Bootstrap04 <tibble [2 × 3… <tibble [0 × … <tibble [335 × 3…
5 <split [917/345… Bootstrap05 <tibble [2 × 3… <tibble [0 × … <tibble [345 × 3…
6 <split [917/351… Bootstrap06 <tibble [2 × 3… <tibble [0 × … <tibble [351 × 3…
7 <split [917/342… Bootstrap07 <tibble [2 × 3… <tibble [0 × … <tibble [342 × 3…
8 <split [917/322… Bootstrap08 <tibble [2 × 3… <tibble [0 × … <tibble [322 × 3…
9 <split [917/330… Bootstrap09 <tibble [2 × 3… <tibble [0 × … <tibble [330 × 3…
10 <split [917/342… Bootstrap10 <tibble [2 × 3… <tibble [0 × … <tibble [342 × 3…
# … with 15 more rows
Thanks for the report! 🙌
Awesome! Thank you so much. Confirming that it works :)
I encountered the same error as mentioned above. Initially, I thought that fixing car_train as below would work:
car_train <- car_train %>% mutate_if(is.character, as.factor)
since the affected code uses car_train in the RF portion:
results <- car_train %>%
  mutate(mpg = log(mpg)) %>%
  bind_cols(predict(fit_lm, car_train) %>% rename(.pred_lm = .pred)) %>%
  bind_cols(predict(fit_rf, car_train) %>% rename(.pred_rf = .pred))
But the above does not solve the error. Why?
Your solution below does work:
car_vars <- cars2018 %>%
  select(-Model, -`Model Index`) %>%
  janitor::clean_names() %>%
  mutate_if(is.character, factor)
The reason that just converting from character to factor doesn't solve the problem is that some of the column names have spaces in them, which does not play well with the internals of, I think, the randomForest package. We can fix this by using janitor::clean_names() to make all the column names nicer.
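If it helps to see which columns are affected before cleaning, here is a minimal base R sketch. The toy data frame is hypothetical, with column names modeled on the ones in the error message; the check compares each name against what make.names() would turn it into.

```r
# Hypothetical data with names modeled on those in the error message.
toy <- data.frame(
  `Lockup Torque Converter` = c("Y", "N"),
  `Recommended Fuel` = c("Regular", "Premium"),
  mpg = c(30, 22),
  check.names = FALSE  # keep the spaces instead of letting data.frame() fix them
)

# make.names() rewrites anything that is not a syntactic R name,
# so any name it changes is one that can cause trouble in formulas.
bad_names <- names(toy)[names(toy) != make.names(names(toy))]
bad_names
```

Here bad_names holds the two names with spaces, while mpg is left out because it is already a valid name.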
Hello! I'm going over Chapter 1. In sections 8 & 9:
Produces:
This was solved prior to splitting data by
However, in section 11
produces
x Bootstrap01: formula: Error: Functions involving factors or characters have been detected on the RHS of formula. These are not allowed when indicators = "none". Functions involving factors were detected for the following columns: 'Lockup Torque Converter', 'Recommended Fuel', 'Fuel injection'.
I did notice that the tidymodels version for the course is 0.1.0 and mine is 0.1.1. Is it just a version issue, or do you have any advice on how to solve the previous error message?
Regards, Maria