Closed elpidiofilho closed 6 years ago
Take a look at the warnings:
kernlab
class prediction calculations failed; returning NAs
SVMs don't naturally produce probability estimates, so a secondary model (basically a logistic model) is fit. This extra model is supposed to translate the SVM output into probabilities, and here it failed. This happens periodically and there isn't much that anything outside of kernlab can do to fix it.
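To make the idea of the secondary model concrete, here is a minimal base-R sketch. It is not kernlab's actual implementation: the decision values are simulated stand-ins for raw SVM output, and the names decision, y and calib are made up for illustration.

```r
# Illustrative only: simulated scores stand in for raw SVM decision values.
set.seed(1)
n <- 200
decision <- rnorm(n)                                   # hypothetical SVM scores
y <- rbinom(n, size = 1, prob = plogis(2 * decision))  # simulated class labels

# Secondary logistic model that maps scores to class probabilities
calib <- glm(y ~ decision, family = binomial())

# Translate a few new scores into probabilities
prob <- predict(calib, newdata = data.frame(decision = c(-2, 0, 2)),
                type = "response")
print(round(prob, 3))
```

If a fit like this cannot be estimated (for example, when the scores separate the classes perfectly), no probabilities are available for that resample, which is one plausible way to end up with NAs.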
Out of curiosity, why run loadNamespace?
Max, the above code works correctly if I add a call to the kernlab library in my source code. It seems to me that caret is not loading the kernlab library before running the svmPoly classifier.
This code runs OK with the call to the kernlab library in my code, but if I remove this line I get an error.
library(caret)
library(kernlab)
model <- "svmPoly"
set.seed(2)
training <- twoClassSim(50, linearVars = 2)
trainX <- training[, -ncol(training)]
trainY <- training$Class
cctrl1 <- trainControl(method = "cv", number = 3, returnResamp = "all")
set.seed(849)
test_class_cv_model <- train(trainX, trainY,
method = "svmPoly",
trControl = cctrl1,
preProc = c("center", "scale"))
caret is not loading the kernlab library before running the svmPoly classifier.
It used to load the library, which was fairly bad form. As of the last version, it loads the namespace instead, which avoids name collisions. You should not have to do that with the current devel or CRAN versions of caret. I did not use it to get your code to run.
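The difference between attaching a package and only loading its namespace can be seen with a small base-R sketch. The splines package is used here purely as a stand-in because it ships with R; the same mechanics would apply to kernlab.

```r
# Load a namespace without attaching it to the search path
loadNamespace("splines")

loaded   <- isNamespaceLoaded("splines")        # namespace is available
attached <- "package:splines" %in% search()     # not attached unless library() was called elsewhere

# Functions are still reachable with an explicit namespace prefix
basis <- splines::bs(1:10, df = 3)

print(c(loaded = loaded, attached = attached))
print(dim(basis))
```

Because nothing is added to the search path, functions from the loaded package cannot mask functions of the same name in the user's session, which is the name-collision point made above.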
Keep in mind that the logistic model that I mentioned uses random numbers and, even when setting the seed, you might not be able to get reproducible failures or successes from ksvm (at least that was the case the last time I answered this question).
This should be fine with the current devel. I fixed these issues today but it would be good to have someone do an external test on their system.
Max, I installed the current devel and the errors that I had reported (lasso and svmPoly) disappeared. I decided to run all the regression methods of the caret package on the airquality dataset. Below is a summary of the results obtained.
dataset : airquality
Original: 128 regression models
Removed slow models
[1] "ANFIS" "bartMachine" "DENFIS" "earth" "FIR.DM" "FS.HGD" "GFS.FR.MOGUL" "GFS.LT.RS"
[9] "GFS.THRIFT" "HYFIS"
[1] "failed models" 45
[1] "bag" "BstLm" "dnn" "gaussprRadial" "gbm_h2o"
[6] "glm.nb" "glmnet_h2o" "logicBag" "logreg" "mlpKerasDecay"
[11] "mlpKerasDropout" "mlpSGD" "msaenet" "mxnet" "mxnetAdam"
[16] "nnls" "null" "ordinalNet" "parRF" "penalized"
[21] "plsRglm" "pythonKnnReg" "qrf" "randomGLM" "ranger"
[26] "rbf" "Rborist" "rfRules" "rqlasso" "rqnc"
[31] "RRF" "RRFglobal" "SBC" "spikeslab" "svmBoundrangeString"
[36] "svmExpoString" "svmLinear2" "svmLinear3" "svmPoly" "svmSpectrumString"
[41] "treebag" "xgbDART" "xgbLinear" "xgbTree" "xyf"
Successfully run models = 73
Models that depend on libraries that aren't on CRAN
msaenet, mxnetAdam --> mxnet
pythonKnnReg --> rPython
Some of those are not included in the package because there are issues with the code (and the people who added them aren't responding, such as pythonKnnReg). If you looked at what is in the models directory, there are some that shouldn't be included for testing. Use the models that are available via getModelInfo.
Also, there are some in there that make no sense to run for that data set (e.g. svmExpoString) and others that depend on external libraries (e.g. keras, h2o). Were those installed and verified to be working?
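As a rough sketch of that selection step: the snippet below takes the models returned by getModelInfo, keeps those that support regression, and flags the ones whose R packages are not installed. The names reg_info, missing_pkgs and runnable are illustrative. It only checks for installed R packages; it cannot tell whether external software such as a Python or Java backend is actually working.

```r
# Sketch: list caret's regression models and flag missing R packages.
if (requireNamespace("caret", quietly = TRUE)) {
  info <- caret::getModelInfo()   # models shipped with the installed package

  # Keep only models that declare support for regression
  is_reg <- vapply(info, function(m) "Regression" %in% m$type, logical(1))
  reg_info <- info[is_reg]

  # For each model, which of its required packages are not installed?
  installed <- rownames(installed.packages())
  missing_pkgs <- lapply(reg_info, function(m) setdiff(m$library, installed))

  # Models whose package requirements are all satisfied
  runnable <- names(reg_info)[lengths(missing_pkgs) == 0]
  cat(length(runnable), "of", length(reg_info),
      "regression models have all required R packages installed\n")
}
```

Models that only make sense for other kinds of data (string kernels, for instance) would still need to be dropped by hand.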
Also, I ran all of the regression tests yesterday after a big commit and they worked for those test cases.
For the svm models, let's start by getting reprexes (= {repr}oducible {ex}amples) for these test cases so that I can try to reproduce them. Also, I suggest using the sessioninfo package to get version information since that gives a lot more detail (but that's not required).
As an aside, I've thought about removing all of the frbs models because they are very slow and tend to consistently fail the basic regression tests.
for bag model
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(caret))
data("airquality")
d = airquality %>% na.omit() %>% select(-Month,-Day) %>% data.frame()
set.seed(313)
va = caret::createDataPartition(d[,1], p = 0.75, list = F)
train = d[va,]
test = d[-va,]
resample_ = 'cv'
nfolds = 5;
regressor = 'bag'
caret::getModelInfo(regressor, regex = F)[[1]]$type
#> [1] "Regression" "Classification"
tc <- trainControl( method = resample_, number = nfolds)
fit1 = caret::train(x = train[,-1], y = train[,1], method = regressor, metric = 'Rsquared', trControl = tc)
#> Warning: model fit failed for Fold1: vars=3 Error in bag.default(x, y, vars = param$vars, ...) :
#> Please specify 'bagControl' with the appropriate functions
#> Warning: model fit failed for Fold2: vars=3 Error in bag.default(x, y, vars = param$vars, ...) :
#> Please specify 'bagControl' with the appropriate functions
#> Warning: model fit failed for Fold3: vars=3 Error in bag.default(x, y, vars = param$vars, ...) :
#> Please specify 'bagControl' with the appropriate functions
#> Warning: model fit failed for Fold4: vars=3 Error in bag.default(x, y, vars = param$vars, ...) :
#> Please specify 'bagControl' with the appropriate functions
#> Warning: model fit failed for Fold5: vars=3 Error in bag.default(x, y, vars = param$vars, ...) :
#> Please specify 'bagControl' with the appropriate functions
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
#> trainInfo, : There were missing values in resampled performance measures.
#> Something is wrong; all the Rsquared metric values are missing:
#> RMSE Rsquared MAE
#> Min. : NA Min. : NA Min. : NA
#> 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
#> Median : NA Median : NA Median : NA
#> Mean :NaN Mean :NaN Mean :NaN
#> 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
#> Max. : NA Max. : NA Max. : NA
#> NA's :1 NA's :1 NA's :1
#> Error: Stopping
warnings()
#> NULL
fit1
#> Error in eval(expr, envir, enclos): object 'fit1' not found
if (is.null(fit1) == FALSE) {
v = predict(fit1, test[,-1])
plot(v, test$Ozone)
abline(0,1)
caret::postResample((unlist(v)), test$Ozone)
}
#> Error in eval(expr, envir, enclos): object 'fit1' not found
sessionInfo()
#> R version 3.4.2 (2017-09-28)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 15063)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
#> [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
#> [5] LC_TIME=Portuguese_Brazil.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] caret_6.0-77.9000 ggplot2_2.2.1 lattice_0.20-35 dplyr_0.7.4
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_0.12.13 lubridate_1.7.1 tidyr_0.7.2
#> [4] class_7.3-14 assertthat_0.2.0 rprojroot_1.2
#> [7] digest_0.6.12 ipred_0.9-6 psych_1.7.8
#> [10] foreach_1.4.3 R6_2.2.2 plyr_1.8.4
#> [13] backports_1.1.1 stats4_3.4.2 evaluate_0.10.1
#> [16] rlang_0.1.4 lazyeval_0.2.1 kernlab_0.9-25
#> [19] rpart_4.1-11 Matrix_1.2-11 rmarkdown_1.6
#> [22] splines_3.4.2 CVST_0.2-1 ddalpha_1.3.1
#> [25] gower_0.1.2 stringr_1.2.0 foreign_0.8-69
#> [28] munsell_0.4.3 broom_0.4.2 compiler_3.4.2
#> [31] pkgconfig_2.0.1 mnormt_1.5-5 dimRed_0.1.0
#> [34] htmltools_0.3.6 nnet_7.3-12 tidyselect_0.2.3
#> [37] tibble_1.3.4 prodlim_1.6.1 DRR_0.0.2
#> [40] codetools_0.2-15 RcppRoll_0.2.3 withr_2.1.0
#> [43] MASS_7.3-47 recipes_0.1.0.9000 ModelMetrics_1.1.0
#> [46] grid_3.4.2 nlme_3.1-131 gtable_0.2.0
#> [49] magrittr_1.5 scales_0.5.0 stringi_1.1.5
#> [52] reshape2_1.4.2 bindrcpp_0.2 timeDate_3012.100
#> [55] robustbase_0.92-8 lava_1.5.1 iterators_1.0.8
#> [58] tools_3.4.2 glue_1.2.0 DEoptimR_1.0-8
#> [61] purrr_0.2.4 sfsmisc_1.1-1 parallel_3.4.2
#> [64] survival_2.41-3 yaml_2.1.14 colorspace_1.3-2
#> [67] knitr_1.17 bindr_0.1
for BstLm model
Sys.setenv(LANG="EN")
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(caret))
data("airquality")
d = airquality %>% na.omit() %>% select(-Month,-Day) %>% data.frame()
set.seed(313)
va = caret::createDataPartition(d[,1], p = 0.75, list = F)
train = d[va,]
test = d[-va,]
resample_ = 'cv'
nfolds = 5;
regressor = "BstLm"
caret::getModelInfo(regressor, regex = F)[[1]]$type
#> [1] "Regression" "Classification"
tc <- trainControl( method = resample_, number = nfolds)
fit1 = caret::train(x = train[,-1], y = train[,1], method = regressor, metric = 'Rsquared', trControl = tc)
#> Error in .(nu): could not find function "."
warnings()
#> NULL
fit1
#> Error in eval(expr, envir, enclos): object 'fit1' not found
if (is.null(fit1) == FALSE) {
v = predict(fit1, test[,-1])
plot(v, test$Ozone)
abline(0,1)
caret::postResample((unlist(v)), test$Ozone)
}
#> Error in eval(expr, envir, enclos): object 'fit1' not found
sessionInfo()
#> R version 3.4.2 (2017-09-28)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 15063)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
#> [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
#> [5] LC_TIME=Portuguese_Brazil.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] caret_6.0-77.9000 ggplot2_2.2.1 lattice_0.20-35 dplyr_0.7.4
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_0.12.13 lubridate_1.7.1 tidyr_0.7.2
#> [4] class_7.3-14 assertthat_0.2.0 rprojroot_1.2
#> [7] digest_0.6.12 ipred_0.9-6 psych_1.7.8
#> [10] foreach_1.4.3 R6_2.2.2 plyr_1.8.4
#> [13] backports_1.1.1 stats4_3.4.2 evaluate_0.10.1
#> [16] rlang_0.1.4 lazyeval_0.2.1 kernlab_0.9-25
#> [19] rpart_4.1-11 Matrix_1.2-11 rmarkdown_1.6
#> [22] splines_3.4.2 CVST_0.2-1 ddalpha_1.3.1
#> [25] gower_0.1.2 stringr_1.2.0 foreign_0.8-69
#> [28] munsell_0.4.3 broom_0.4.2 compiler_3.4.2
#> [31] pkgconfig_2.0.1 mnormt_1.5-5 dimRed_0.1.0
#> [34] gbm_2.1.3 htmltools_0.3.6 nnet_7.3-12
#> [37] tidyselect_0.2.3 tibble_1.3.4 prodlim_1.6.1
#> [40] DRR_0.0.2 codetools_0.2-15 RcppRoll_0.2.3
#> [43] withr_2.1.0 MASS_7.3-47 recipes_0.1.0.9000
#> [46] ModelMetrics_1.1.0 grid_3.4.2 nlme_3.1-131
#> [49] gtable_0.2.0 magrittr_1.5 scales_0.5.0
#> [52] stringi_1.1.5 reshape2_1.4.2 doParallel_1.0.11
#> [55] bst_0.3-14 bindrcpp_0.2 timeDate_3012.100
#> [58] robustbase_0.92-8 lava_1.5.1 iterators_1.0.8
#> [61] tools_3.4.2 glue_1.2.0 DEoptimR_1.0-8
#> [64] purrr_0.2.4 sfsmisc_1.1-1 parallel_3.4.2
#> [67] survival_2.41-3 yaml_2.1.14 colorspace_1.3-2
#> [70] knitr_1.17 bindr_0.1
for gaussprRadial model
Sys.setenv(LANG="EN")
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(caret))
data("airquality")
d = airquality %>% na.omit() %>% select(-Month,-Day) %>% data.frame()
set.seed(313)
va = caret::createDataPartition(d[,1], p = 0.75, list = F)
train = d[va,]
test = d[-va,]
resample_ = 'cv'
nfolds = 5;
regressor = "gaussprRadial"
caret::getModelInfo(regressor, regex = F)[[1]]$type
#> [1] "Regression" "Classification"
tc <- trainControl( method = resample_, number = nfolds)
fit1 = caret::train(x = train[,-1], y = train[,1], method = regressor, metric = 'Rsquared', trControl = tc)
#> Warning: predictions failed for Fold1: sigma=0.3993 Error in UseMethod("predict") :
#> no applicable method for 'predict' applied to an object of class "c('gausspr', 'vm')"
#> Warning: predictions failed for Fold2: sigma=0.3993 Error in UseMethod("predict") :
#> no applicable method for 'predict' applied to an object of class "c('gausspr', 'vm')"
#> Warning: predictions failed for Fold3: sigma=0.3993 Error in UseMethod("predict") :
#> no applicable method for 'predict' applied to an object of class "c('gausspr', 'vm')"
#> Warning: predictions failed for Fold4: sigma=0.3993 Error in UseMethod("predict") :
#> no applicable method for 'predict' applied to an object of class "c('gausspr', 'vm')"
#> Warning: predictions failed for Fold5: sigma=0.3993 Error in UseMethod("predict") :
#> no applicable method for 'predict' applied to an object of class "c('gausspr', 'vm')"
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
#> trainInfo, : There were missing values in resampled performance measures.
#> Something is wrong; all the Rsquared metric values are missing:
#> RMSE Rsquared MAE
#> Min. : NA Min. : NA Min. : NA
#> 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
#> Median : NA Median : NA Median : NA
#> Mean :NaN Mean :NaN Mean :NaN
#> 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
#> Max. : NA Max. : NA Max. : NA
#> NA's :1 NA's :1 NA's :1
#> Error: Stopping
warnings()
#> NULL
fit1
#> Error in eval(expr, envir, enclos): object 'fit1' not found
if (is.null(fit1) == FALSE) {
v = predict(fit1, test[,-1])
plot(v, test$Ozone)
abline(0,1)
caret::postResample((unlist(v)), test$Ozone)
}
#> Error in eval(expr, envir, enclos): object 'fit1' not found
for glm.nb model
Sys.setenv(LANG="EN")
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(caret))
data("airquality")
d = airquality %>% na.omit() %>% select(-Month,-Day) %>% data.frame()
set.seed(313)
va = caret::createDataPartition(d[,1], p = 0.75, list = F)
train = d[va,]
test = d[-va,]
resample_ = 'cv'
nfolds = 5;
regressor = "glm.nb"
caret::getModelInfo(regressor, regex = F)[[1]]$type
#> [1] "Regression"
tc <- trainControl( method = resample_, number = nfolds)
fit1 = caret::train(x = train[,-1], y = train[,1], method = regressor, metric = 'Rsquared', trControl = tc)
#> Warning: model fit failed for Fold1: link=log Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold1: link=sqrt Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold1: link=identity Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold2: link=log Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold2: link=sqrt Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold2: link=identity Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold3: link=log Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold3: link=sqrt Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold3: link=identity Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold4: link=log Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold4: link=sqrt Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold4: link=identity Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(190L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold5: link=log Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(299L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold5: link=sqrt Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(299L, :
#> could not find function "glm.nb"
#> Warning: model fit failed for Fold5: link=identity Error in glm.nb(formula = .outcome ~ ., data = structure(list(Solar.R = c(299L, :
#> could not find function "glm.nb"
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
#> trainInfo, : There were missing values in resampled performance measures.
#> Something is wrong; all the Rsquared metric values are missing:
#> RMSE Rsquared MAE
#> Min. : NA Min. : NA Min. : NA
#> 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
#> Median : NA Median : NA Median : NA
#> Mean :NaN Mean :NaN Mean :NaN
#> 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
#> Max. : NA Max. : NA Max. : NA
#> NA's :3 NA's :3 NA's :3
#> Error: Stopping
warnings()
#> NULL
fit1
#> Error in eval(expr, envir, enclos): object 'fit1' not found
if (is.null(fit1) == FALSE) {
v = predict(fit1, test[,-1])
plot(v, test$Ozone)
abline(0,1)
caret::postResample((unlist(v)), test$Ozone)
}
#> Error in eval(expr, envir, enclos): object 'fit1' not found
I've updated a few models (and their tests). In some cases, the data packages loaded the library in the tests, so the error was not caught.
However, for the bagging model, please note the error message: Please specify 'bagControl' with the appropriate functions. This is not an issue with caret.
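For reference, a hedged sketch of what supplying bagControl could look like for the airquality example. It assumes the party package is installed and uses the ctreeBag helper functions that ship with caret; the choice of conditional inference trees as the bagged model is just one option, not a recommendation from this thread.

```r
# Sketch: method = "bag" needs a bagControl object describing how to fit,
# predict and aggregate the bagged models. Assumes caret and party are installed.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("party", quietly = TRUE)) {
  library(caret)

  d <- na.omit(airquality)[, c("Ozone", "Solar.R", "Wind", "Temp")]

  # Use caret's built-in helpers for bagging conditional inference trees
  ctrl <- bagControl(fit = ctreeBag$fit,
                     predict = ctreeBag$pred,
                     aggregate = ctreeBag$aggregate)

  set.seed(313)
  fit <- train(x = d[, -1], y = d$Ozone,
               method = "bag",
               B = 10,                 # number of bootstrap samples
               bagControl = ctrl,      # passed through train() to bag()
               trControl = trainControl(method = "cv", number = 5))
  print(fit$results)
}
```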
I also tested a lot of models on your list of 45 and the vast majority of them were false positives.
For null model
Sys.setenv(LANG="EN")
suppressPackageStartupMessages(library(dplyr))
#> Warning: package 'dplyr' was built under R version 3.4.2
suppressPackageStartupMessages(library(caret))
data("airquality")
d = airquality %>% na.omit() %>% select(-Month,-Day) %>% data.frame()
set.seed(313)
va = caret::createDataPartition(d[,1], p = 0.75, list = F)
train = d[va,]
test = d[-va,]
resample_ = 'cv'
nfolds = 5;
regressor = "null"
caret::getModelInfo(regressor, regex = F)[[1]]$type
#> [1] "Classification" "Regression"
tc <- trainControl( method = resample_, number = nfolds)
fit1 = caret::train(x = train[,-1], y = train[,1], method = regressor, metric = 'Rsquared', trControl = tc)
#> Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
#> trainInfo, : There were missing values in resampled performance measures.
#> Something is wrong; all the Rsquared metric values are missing:
#> RMSE Rsquared MAE
#> Min. :31.55 Min. : NA Min. :25.16
#> 1st Qu.:31.55 1st Qu.: NA 1st Qu.:25.16
#> Median :31.55 Median : NA Median :25.16
#> Mean :31.55 Mean :NaN Mean :25.16
#> 3rd Qu.:31.55 3rd Qu.: NA 3rd Qu.:25.16
#> Max. :31.55 Max. : NA Max. :25.16
#> NA's :1
#> Error: Stopping
warnings()
#> NULL
fit1
#> Error in eval(expr, envir, enclos): object 'fit1' not found
if (is.null(fit1) == FALSE) {
v = predict(fit1, test[,-1])
plot(v, test$Ozone)
abline(0,1)
caret::postResample((unlist(v)), test$Ozone)
}
#> Error in eval(expr, envir, enclos): object 'fit1' not found
sessionInfo()
#> R version 3.4.1 (2017-06-30)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 15063)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
#> [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
#> [5] LC_TIME=Portuguese_Brazil.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] caret_6.0-77.9000 ggplot2_2.2.1 lattice_0.20-35 dplyr_0.7.4
#>
#> loaded via a namespace (and not attached):
#> [1] purrr_0.2.4 reshape2_1.4.2 kernlab_0.9-25
#> [4] splines_3.4.1 colorspace_1.3-2 stats4_3.4.1
#> [7] htmltools_0.3.6 yaml_2.1.14 survival_2.41-3
#> [10] prodlim_1.6.1 rlang_0.1.4 ModelMetrics_1.1.0
#> [13] withr_2.1.0 glue_1.2.0 bindrcpp_0.2
#> [16] foreach_1.4.3 bindr_0.1 plyr_1.8.4
#> [19] dimRed_0.1.0 lava_1.5.1 robustbase_0.92-8
#> [22] stringr_1.2.0 timeDate_3012.100 munsell_0.4.3
#> [25] gtable_0.2.0 recipes_0.1.0 codetools_0.2-15
#> [28] evaluate_0.10.1 knitr_1.17 class_7.3-14
#> [31] DEoptimR_1.0-8 Rcpp_0.12.13 scales_0.5.0
#> [34] backports_1.1.1 ipred_0.9-6 CVST_0.2-1
#> [37] digest_0.6.12 stringi_1.1.5 RcppRoll_0.2.2
#> [40] ddalpha_1.3.1 grid_3.4.1 rprojroot_1.2
#> [43] tools_3.4.1 magrittr_1.5 lazyeval_0.2.1
#> [46] tibble_1.3.4 DRR_0.0.2 pkgconfig_2.0.1
#> [49] MASS_7.3-47 Matrix_1.2-11 lubridate_1.7.1
#> [52] gower_0.1.2 assertthat_0.2.0 rmarkdown_1.6.0.9000
#> [55] iterators_1.0.8 R6_2.2.2 rpart_4.1-11
#> [58] sfsmisc_1.1-1 nnet_7.3-12 nlme_3.1-131
#> [61] compiler_3.4.1
This is a false positive. For regression, the null model predicts the mean of the outcome for every sample, so the predictions have no variation and $R^2$ cannot be calculated. You should get an error here.
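A small base-R sketch of why this happens, assuming $R^2$ is computed as the squared correlation between observed and predicted values. The numbers are made up for illustration.

```r
# Made-up outcome values and a "null model" that always predicts their mean
obs  <- c(41, 36, 12, 18, 23)
pred <- rep(mean(obs), length(obs))

# Correlation with a constant vector is undefined, so R^2 is NA
r2 <- suppressWarnings(cor(pred, obs)^2)

# RMSE does not depend on variation in the predictions and is still defined
rmse <- sqrt(mean((obs - pred)^2))

print(c(Rsquared = r2, RMSE = rmse))
```

Asking train to optimize metric = 'Rsquared' for this model therefore has nothing to select on, whereas a metric such as RMSE remains available.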
caret (GitHub version 6.0-77) displays an error message when I try to fit an svmPoly model.
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :27 NA's :27
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
I tried the code from the caret regression test at https://github.com/topepo/caret/blob/master/RegressionTests/Code/svmPoly.R and got the same error message.
Code :