juliasilge / supervised-ML-case-studies-course

Supervised machine learning case studies in R! 💫 A free interactive tidymodels course
https://supervised-ml-course.netlify.app/
MIT License

Chapter 3.14.2 R Session Aborted when running locally #63

Closed: gabegarcia15 closed this issue 3 years ago

gabegarcia15 commented 3 years ago

I have the following code, which matches the course solution:

library(tidymodels)
library(themis)

vote_train <- readRDS("data/c3_train_10_percent.rds")

vote_folds <- vfold_cv(vote_train, v = 10)

vote_recipe <- recipe(turnout16_2016 ~ ., data = vote_train) %>% 
    step_upsample(turnout16_2016)

rf_spec <- rand_forest() %>%
    set_engine("ranger") %>%
    set_mode("classification")

vote_wf <- workflow() %>%
    add_recipe(vote_recipe) %>%
    add_model(rf_spec)

set.seed(234)
rf_res <- vote_wf %>%
    fit_resamples(
        vote_folds,
        metrics = metric_set(roc_auc, sens, spec),
        control = control_resamples(save_pred = TRUE)
    )

glimpse(rf_res)

When I run the following code chunk locally, the R session aborts ("R Session Aborted"):

rf_res <- vote_wf %>%
    fit_resamples(
        vote_folds,
        metrics = metric_set(roc_auc, sens, spec),
        control = control_resamples(save_pred = TRUE)
    )

I'm not sure whether this is related to the ranger package, but I got no errors when running 3.14.1 (i.e. logistic regression), nor when I changed set_engine("ranger") to set_engine("randomForest") for 3.14.2.

I can also confirm that the package versions in my local environment match those used in the course.
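A version claim like this is easier to check when the installed versions are printed alongside the report. The following is a minimal sketch using only base R; the package list is an assumption drawn from the packages named in this thread and can be adjusted.

```r
# Packages that appear in this thread (an assumed list; adjust as needed).
pkgs <- c("tidymodels", "themis", "ranger", "tune", "recipes", "parsnip")

# Look up each installed version; NA marks a package that is not installed.
installed <- rownames(installed.packages())
versions <- vapply(
  pkgs,
  function(p) {
    if (p %in% installed) as.character(packageVersion(p)) else NA_character_
  },
  character(1)
)

# Print the R version and the named vector of package versions.
print(R.version.string)
print(versions)
```

Pasting this output into the issue lets others compare it line by line against the versions used in the course.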

juliasilge commented 3 years ago

Hmmmm, I can't reproduce this, unfortunately. When I run this locally I do see one warning related to using such a small dataset and not having any true events, but the tuning does finish:

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(themis)
#> Registered S3 methods overwritten by 'themis':
#>   method                  from   
#>   bake.step_downsample    recipes
#>   bake.step_upsample      recipes
#>   prep.step_downsample    recipes
#>   prep.step_upsample      recipes
#>   tidy.step_downsample    recipes
#>   tidy.step_upsample      recipes
#>   tunable.step_downsample recipes
#>   tunable.step_upsample   recipes
#> 
#> Attaching package: 'themis'
#> The following objects are masked from 'package:recipes':
#> 
#>     step_downsample, step_upsample

vote_train <- readRDS("data/c3_train_10_percent.rds")

vote_folds <- vfold_cv(vote_train, v = 10)

vote_recipe <- recipe(turnout16_2016 ~ ., data = vote_train) %>% 
    step_upsample(turnout16_2016)

rf_spec <- rand_forest() %>%
    set_engine("ranger") %>%
    set_mode("classification")

vote_wf <- workflow() %>%
    add_recipe(vote_recipe) %>%
    add_model(rf_spec)

set.seed(234)
rf_res <- vote_wf %>%
    fit_resamples(
        vote_folds,
        metrics = metric_set(roc_auc, sens, spec),
        control = control_resamples(save_pred = TRUE)
    )

glimpse(rf_res)
#> Rows: 10
#> Columns: 5
#> $ splits       <list> [<vfold_split[481 x 54 x 535 x 42]>], [<vfold_split[481 …
#> $ id           <chr> "Fold01", "Fold02", "Fold03", "Fold04", "Fold05", "Fold06…
#> $ .metrics     <list> [<tbl_df[3 x 4]>], [<tbl_df[3 x 4]>], [<tbl_df[3 x 4]>],…
#> $ .notes       <list> [<tbl_df[0 x 1]>], [<tbl_df[0 x 1]>], [<tbl_df[0 x 1]>],…
#> $ .predictions <list> [<tbl_df[54 x 6]>], [<tbl_df[54 x 6]>], [<tbl_df[54 x 6]…

Created on 2021-08-02 by the reprex package (v2.0.0)

Could you try running this via reprex perhaps to see if something in your environment is causing the problem?
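For anyone unsure how to do that, here is a rough sketch of one way to run the failing code through reprex. It assumes the reprex package is installed and that the script is saved in the project root, so the relative path to the data resolves; the file name rf-crash-check.R is hypothetical. Because reprex renders the code in a separate, fresh R process, a crash there should be reported back rather than taking down the interactive session, although that is an expectation and not something confirmed in this thread.

```r
# Hypothetical file name; keep it in the project root so that the relative
# path "data/c3_train_10_percent.rds" resolves when the script is rendered.
script <- "rf-crash-check.R"

# Write the code from this issue to a standalone script.
writeLines(c(
  'library(tidymodels)',
  'library(themis)',
  '',
  'vote_train <- readRDS("data/c3_train_10_percent.rds")',
  'vote_folds <- vfold_cv(vote_train, v = 10)',
  '',
  'vote_recipe <- recipe(turnout16_2016 ~ ., data = vote_train) %>%',
  '    step_upsample(turnout16_2016)',
  '',
  'rf_spec <- rand_forest() %>%',
  '    set_engine("ranger") %>%',
  '    set_mode("classification")',
  '',
  'vote_wf <- workflow() %>%',
  '    add_recipe(vote_recipe) %>%',
  '    add_model(rf_spec)',
  '',
  'set.seed(234)',
  'rf_res <- vote_wf %>%',
  '    fit_resamples(',
  '        vote_folds,',
  '        metrics = metric_set(roc_auc, sens, spec),',
  '        control = control_resamples(save_pred = TRUE)',
  '    )',
  '',
  'glimpse(rf_res)'
), script)

# Render the script in a fresh R process and append session information.
# Skipped quietly if reprex is not installed.
if (requireNamespace("reprex", quietly = TRUE)) {
  reprex::reprex(input = script, session_info = TRUE)
}
```

The rendered output, including the session information, can then be pasted into the issue as is.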

juliasilge commented 3 years ago

Let me know if you are able to reproduce this problem using a reprex!