topepo / caret

caret (Classification And Regression Training) is an R package that contains miscellaneous functions for training and plotting classification and regression models
http://topepo.github.io/caret/index.html

Training error in rvmLinear model when using lower-dimensional input data #871

Closed mertyagli closed 6 years ago

mertyagli commented 6 years ago

Hello,

I am receiving the following error when I try to train a Relevance Vector Machine with a linear kernel (rvmLinear) on training data with fewer dimensions.

Error message:

Error in chol.default(crossprod(Kr)/var + diag(1/thetatmp)) : 
  the leading minor of order 1498 is not positive definite 
14. chol.default(crossprod(Kr)/var + diag(1/thetatmp))
13. chol(crossprod(Kr)/var + diag(1/thetatmp))
12. as.matrix(r)
11. backsolve(chol(crossprod(Kr)/var + diag(1/thetatmp)), diag(1, n))
10. .local(x, ...)
9. kernlab:::rvm(x = as.matrix(x), y = y, kernel = kernlab::vanilladot(), ...)
8. kernlab:::rvm(x = as.matrix(x), y = y, kernel = kernlab::vanilladot(), ...) at rvmLinear.R#10
7. method$fit(x = x, y = y, wts = wts, param = tuneValue, lev = obsLevels, last = last, classProbs = classProbs, ...)
6. createModel(x = subset_x(x, indexFinal), y = y[indexFinal], wts = weights[indexFinal], method = models, tuneValue = bestTune, obsLevels = classLevels, pp = ppOpt, last = TRUE, classProbs = trControl$classProbs, sampling = trControl$sampling, ...)
5. system.time(finalModel <- createModel(x = subset_x(x, indexFinal), y = y[indexFinal], wts = weights[indexFinal], method = models, tuneValue = bestTune, obsLevels = classLevels, pp = ppOpt, last = TRUE, classProbs = trControl$classProbs, sampling = trControl$sampling, ...
4. train.default(x, y, weights = w, ...)
3. train(x, y, weights = w, ...)
2. train.formula(Response ~ ., data = data.set, method = "rvmLinear", trControl = myTimeControl)
1. train(Response ~ ., data = data.set, method = "rvmLinear", trControl = myTimeControl)
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  :
  There were missing values in resampled performance measures.
Timing stopped at: 787.9 4.32 809.9

This issue happened when I reduced the number of features in the training data. Initially, the training data had 98 features and the rvmLinear model generated pretty good results. Then I ran a feature selection algorithm, which selected 7 features. After I reduced the data to those selected features, the model could no longer be trained on the 7-feature data.

The code:

library(data.table)
library(caret)

data.meas <- as.data.frame(fread("data2011-1hMINZEROES.csv", header = TRUE, sep = ","))

# Set up a parallel backend on the physical cores
library(doParallel)
cl <- makeCluster(parallel::detectCores(logical = FALSE))
registerDoParallel(cl)

# Preprocessing: rescale every column to [0, 1]
data <- data.meas
pre.data <- preProcess(data, method = c("range"))
pre.data.train <- as.data.frame(predict(pre.data, data))

set.seed(2018)

myTimeControl <- trainControl(method = "cv",
                              allowParallel = TRUE,
                              verboseIter = FALSE)

# Columns selected by the feature selection algorithm; 99 is the "Response"
PSO.features <- c(22, 30, 57, 85, 87, 97, 98, 99)

data.set <- as.data.frame(round(pre.data.train, 3))
data.set <- data.set[, PSO.features]

rvmLinear.mod <- train(Response ~ .,
                       data = data.set,
                       method = "rvmLinear",
                       trControl = myTimeControl)

The following is the link to download the data. https://mega.nz/#!8wpl2DIZ!kXHjLpulF0eWjb9pxxO4UiAOw6T_NTDHIJ56GWyaGtc

Session Info

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252   
[3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Singapore.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lars_1.2            dplyr_0.7.4         MASS_7.3-47         earth_4.6.2         plotmo_3.3.6       
 [6] TeachingDemos_2.10  plotrix_3.7         doParallel_1.0.11   iterators_1.0.8     foreach_1.4.3      
[11] caret_6.0-79        ggplot2_2.2.1       lattice_0.20-35     data.table_1.10.4-2

loaded via a namespace (and not attached):
  [1] quantregForest_1.3-7 plyr_1.8.4           igraph_1.2.1         lazyeval_0.2.1      
  [5] splines_3.4.1        svUnit_0.7-12        tfruns_1.3           BB_2014.10-1        
  [9] fastICA_1.2-1        TH.data_1.0-8        optimx_2013.8.7      import_1.1.0        
 [13] foba_0.1             magrittr_1.5         sfsmisc_1.1-2        recipes_0.1.2       
 [17] gower_0.1.2          dimRed_0.1.0         sandwich_2.4-0       strucchange_1.5-1   
 [21] colorspace_1.3-2     libcoin_1.0-1        kohonen_3.0.4        regpro_0.1.1        
 [25] jsonlite_1.5         brnn_0.6             qrnn_2.0.2           lme4_1.1-16         
 [29] bindr_0.1.1          zeallot_0.1.0        survival_2.41-3      zoo_1.8-1           
 [33] glue_1.2.0           DRR_0.0.3            mboost_2.8-1         gtable_0.2.0        
 [37] nnls_1.4             ipred_0.9-6          MatrixModels_0.4-1   spls_2.2-2          
 [41] kernlab_0.9-25       ddalpha_1.3.1.1      DEoptimR_1.0-8       abind_1.4-5         
 [45] SparseM_1.77         scales_0.5.0         setRNG_2013.9-1      penalized_0.9-50    
 [49] mvtnorm_1.0-7        evtree_1.0-6         Rcpp_0.12.16         Cubist_0.2.1        
 [53] reticulate_1.6       foreign_0.8-69       Formula_1.2-2        Rborist_0.1-8       
 [57] stats4_3.4.1         lava_1.6.1           stabs_0.6-3          prodlim_1.6.1       
 [61] glmnet_2.0-13        RColorBrewer_1.1-2   modeltools_0.2-21    elmNN_1.0           
 [65] rJava_0.9-9          pkgconfig_2.0.1      nnet_7.3-12          tidyselect_0.2.4    
 [69] rlang_0.2.0          reshape2_1.4.3       munsell_0.4.3        tools_3.4.1         
 [73] xgboost_0.6.4.1      party_1.2-4          ranger_0.9.0         pls_2.6-0           
 [77] denpro_0.9.2         broom_0.4.4          rqPen_2.0            stringr_1.3.0       
 [81] arm_1.9-3            kknn_1.3.1           monmlp_1.1.5         bst_0.3-14          
 [85] ModelMetrics_1.1.0   ncvreg_3.9-1         RSNNS_0.4-10         robustbase_0.92-8   
 [89] purrr_0.2.4          randomForest_4.6-14  bindrcpp_0.2.2       coin_1.2-2          
 [93] nlme_3.1-131         whisker_0.3-2        quantreg_5.35        nodeHarvest_0.7-3   
 [97] relaxo_0.1-2         msaenet_2.8          RcppRoll_0.2.2       leaps_3.0           
[101] compiler_3.4.1       e1071_1.6-8          tibble_1.4.2         stringi_1.1.7       
[105] superpc_1.09         partDSA_0.9.14       Matrix_1.2-10        tensorflow_1.5      
[109] keras_2.1.5          psych_1.8.3.3        nloptr_1.0.4         gbm_2.1.3           
[113] elasticnet_1.1       extraTrees_1.0.5     pillar_1.2.1         LiblineaR_2.10-8    
[117] ucminf_1.1-4         monomvn_1.9-7        R6_2.2.2             FCNN4R_0.6.2        
[121] codetools_0.2-15     assertthat_0.2.0     CVST_0.2-1           Rvmmin_2017-7.18    
[125] optextras_2016-8.8   withr_2.1.2          mnormt_1.5-5         multcomp_1.4-8      
[129] Rcgmin_2013-2.21     quadprog_1.5-5       dfoptim_2018.2-1     grid_3.4.1          
[133] rpart_4.1-11         timeDate_3043.102    tidyr_0.8.0          coda_0.19-1         
[137] class_7.3-14         minqa_1.2.4          inum_1.0-0           partykit_1.2-0      
[141] numDeriv_2016.8-1    lubridate_1.7.3      base64enc_0.1-3 
topepo commented 6 years ago

It is having trouble computing the underlying Cholesky decomposition.
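To illustrate the failure mode (a minimal sketch with synthetic data, not your dataset): perfectly collinear columns make the Gram matrix crossprod(X) singular, and chol() then fails with the same kind of message:

set.seed(1)
x1 <- rnorm(10)
X  <- cbind(x1, 2 * x1)  # two perfectly collinear columns
M  <- crossprod(X)       # rank 1, so not positive definite
chol(M)                  # Error: the leading minor of order 2 is not positive definite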

There is collinearity in the data (although I would not have thought that it would be bad enough to be an issue). The first principal component in the data accounts for about 58% of the total variation, which indicates a fairly strong between-predictor correlation.
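You can verify that along these lines (a quick check, assuming data.set as built in your script, with Response in the last column):

pc <- prcomp(data.set[, names(data.set) != "Response"], scale. = TRUE)
summary(pc)$importance["Proportion of Variance", 1]  # share of variance on PC1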

You could try giving rvm the complete set of PCA scores, or remove a highly correlated variable prior to fitting (see findCorrelation). It seems like an issue with rvm and the numerics rather than with caret.
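For the second option, a sketch using findCorrelation to drop highly correlated predictors before calling train (the 0.9 cutoff is only an example value):

predictors <- data.set[, names(data.set) != "Response"]
drop.cols  <- findCorrelation(cor(predictors), cutoff = 0.9)  # indices of predictors to remove
if (length(drop.cols) > 0) predictors <- predictors[, -drop.cols]
reduced.set <- cbind(predictors, Response = data.set$Response)

rvmLinear.mod <- train(Response ~ ., data = reduced.set,
                       method = "rvmLinear", trControl = myTimeControl)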

mertyagli commented 6 years ago

Hi Topepo,

Sorry for the late response. I am running a performance comparison study of different types of feature selection algorithms. Given this error, it seems quite likely that I will see it again whenever I run a feature selection algorithm, because the selected features might be collinear. Am I right about this?

topepo commented 6 years ago

Yes. You could run an unsupervised filter on the data before using rvm, via the preProc option. That would help reduce these errors.
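For example (a sketch; "pca" replaces the predictors with the full set of principal component scores, and recent caret versions also accept a "corr" filter that wraps findCorrelation):

rvmLinear.mod <- train(Response ~ .,
                       data = data.set,
                       method = "rvmLinear",
                       preProcess = "pca",  # or "corr" to filter highly correlated predictors
                       trControl = myTimeControl)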