I'm using the caret::train() function with K-fold cross-validation (example given below). The dataset has 1470 rows and 40 columns: one target variable and 38 X variables.
The execution time is more than 50 seconds:
train_control <- trainControl(method = "cv", number = 5)
system.time(
  model_lm <- train(YearsAtCompany ~ . - EmployeeNumber,
                    data = hrdatanew,
                    methods = "lm",
                    trControl = train_control)
)
#    user  system elapsed
#  50.107   0.319  50.406
The model output reports Random Forest, although the method I provided is "lm":
model_lm
# Random Forest
# 1470 samples
# 39 predictor
# No pre-processing
# Resampling: Cross-Validated (5 fold)
# Summary of sample sizes: 1175, 1177, 1175, 1176, 1177
# Resampling results across tuning parameters:
#   mtry  RMSE      Rsquared   MAE
#    2    3.383678  0.7809184  2.190747
#   20    2.246496  0.8668412  1.178069
#   38    2.237824  0.8671793  1.192514
# RMSE was used to select the optimal model using the smallest value.
# The final value used for the model was mtry = 38.
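One possible explanation, offered as an assumption rather than a confirmed diagnosis: the call above spells the argument `methods`, while `train()` documents the argument as `method`, with `"rf"` as its default. In R, a named argument that matches no formal parameter is collected by `...` instead of raising an error, so the default would stay in effect. The sketch below reproduces that behaviour in base R; `toy_train` is a hypothetical stand-in, not caret code.

```r
# Hypothetical stand-in for a fitting function whose `method` argument
# defaults to "rf" and which also accepts `...` (not caret's actual code).
toy_train <- function(x, method = "rf", ...) {
  extra <- list(...)  # unmatched named arguments end up here, silently
  list(method = method, ignored = names(extra))
}

# Misspelled argument: `methods` does not match `method`, so the default is used.
res_typo <- toy_train(1, methods = "lm")
print(res_typo$method)   # "rf"
print(res_typo$ignored)  # "methods"

# Correctly spelled argument.
res_ok <- toy_train(1, method = "lm")
print(res_ok$method)     # "lm"
```

If the same thing is happening here, changing `methods = "lm"` to `method = "lm"` in the `train()` call should produce a linear-model fit. It would also account for the long run time, since the output shows a random forest being tuned over three `mtry` values in each of the five folds. This has not been verified on `hrdatanew`.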
I observed almost the same execution time without K-fold cross-validation as well.
I'm using a MacBook Pro Max (64 GB RAM, 32-core GPU).
Please find the session information below:
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.3.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] caret_6.0-92 lattice_0.20-45 Metrics_0.1.4 car_3.0-12 carData_3.0-5 rsample_0.1.1
[7] fastDummies_1.6.3 gridExtra_2.3 ggplot2_3.3.5 readxl_1.4.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 lubridate_1.8.0 tidyr_1.2.0 listenv_0.8.0 class_7.3-20
[6] assertthat_0.2.1 digest_0.6.29 ipred_0.9-12 foreach_1.5.2 utf8_1.2.2
[11] parallelly_1.31.0 R6_2.5.1 cellranger_1.1.0 plyr_1.8.7 hardhat_0.2.0
[16] stats4_4.1.3 evaluate_0.15 pillar_1.7.0 rlang_1.0.2 rstudioapi_0.13
[21] data.table_1.14.2 furrr_0.2.3 rpart_4.1.16 Matrix_1.4-1 rmarkdown_2.13
[26] labeling_0.4.2 splines_4.1.3 gower_1.0.0 stringr_1.4.0 munsell_0.5.0
[31] compiler_4.1.3 xfun_0.30 pkgconfig_2.0.3 globals_0.14.0 htmltools_0.5.2
[36] nnet_7.3-17 tidyselect_1.1.2 tibble_3.1.6 prodlim_2019.11.13 codetools_0.2-18
[41] randomForest_4.7-1 fansi_1.0.3 future_1.24.0 crayon_1.5.1 dplyr_1.0.8
[46] withr_2.5.0 MASS_7.3-56 recipes_0.2.0 ModelMetrics_1.2.2.2 grid_4.1.3
[51] nlme_3.1-157 gtable_0.3.0 lifecycle_1.0.1 DBI_1.1.2 magrittr_2.0.3
[56] pROC_1.18.0 scales_1.2.0 future.apply_1.8.1 cli_3.2.0 stringi_1.7.6
[61] farver_2.1.0 reshape2_1.4.4 timeDate_3043.102 ellipsis_0.3.2 generics_0.1.2
[66] vctrs_0.4.1 lava_1.6.10 iterators_1.0.14 tools_4.1.3 glue_1.6.2
[71] purrr_0.3.4 abind_1.4-5 parallel_4.1.3 fastmap_1.1.0 survival_3.3-1
[76] yaml_2.3.5 colorspace_2.0-3 knitr_1.38