Error: length(unique(indexes)) == 1 is not TRUE for caret large ensemble of models #3

Closed amladv closed 10 years ago

amladv commented 10 years ago

First of all, let me thank you for the much-needed ensemble package you wrote.

I have run into an issue when training a large number of regression models. After pre-selecting the models that work, I am left with 60 to 70 functional models, all listed in caret and all working individually. With these, I attempt to run

greedy <- caretEnsemble(all.models, iter=1000L) 

I get the following message:

Error: length(unique(indexes)) == 1 is not TRUE

Tracing through your code, the origin of the issue is

Error: length(unique(indexes)) == 1 is not TRUE
length(unique(indexes)) = 2

In one example, the entries of unique(indexes) have length 212 in [[1]] and 218 in [[2]], whereas, as per the description, the number of observations should match one of these lengths.

Do you know if there is a way to correct this error from within, so as to include all models? As far as I can see, this seems to be an issue specific to a model or group of models (maybe related to caret 6.0).

Any help would be welcome. Thank you.
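
One way to narrow down which models disagree is to compare their resampling indexes directly. The following is a minimal base-R sketch, not the package's own code: the lists `idx_a`, `idx_b`, `idx_c` and the helper `index_group` are hypothetical stand-ins for what `model$control$index` would hold in fitted caret models.

```r
# Hypothetical stand-ins for model$control$index from three fitted models.
# Each element is a named list of integer row indices, one per resample.
idx_a <- list(Fold1 = c(1L, 2L, 3L, 4L), Fold2 = c(3L, 4L, 5L, 6L))
idx_b <- list(Fold1 = c(1L, 2L, 3L, 4L), Fold2 = c(3L, 4L, 5L, 6L))
idx_c <- list(Fold1 = c(1L, 2L, 5L, 6L), Fold2 = c(1L, 3L, 4L, 6L))
indexes <- list(lm = idx_a, rpart = idx_b, knn = idx_c)

# The failing check: more than one distinct index set is present.
print(length(unique(indexes)))

# Label each model with the position of the first model sharing its index set.
index_group <- function(indexes) {
  vapply(indexes, function(idx) {
    unname(which(vapply(indexes, identical, logical(1), idx))[1])
  }, integer(1))
}

groups <- index_group(indexes)
print(groups)

# Models outside the most common group are the candidates to refit or drop.
majority <- as.integer(names(which.max(table(groups))))
odd_ones <- names(groups)[groups != majority]
print(odd_ones)
```

With real models, `indexes` could be built with `lapply(all.models, function(m) m$control$index)`, assuming each element of `all.models` is a caret `train` object.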

zachmayer commented 10 years ago

This could be an issue specific to caret 6.0-- it's been almost a year since I've updated this code, and it probably needs some work now.

Are you sure you're fitting all of the models with the exact same trainControl function? Can you post a reproducible example (perhaps using the iris dataset?)
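
As an illustration of what "the exact same trainControl" could look like, here is a small sketch that builds the resampling indexes once and passes the same control object to every model. It assumes the caret package is installed; the choice of `lm` and `knn`, the seed, and the fold counts are arbitrary and only for demonstration.

```r
# Sketch only: runs the caret part only if caret is available.
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)

  X <- iris[, c("Sepal.Width", "Petal.Length")]
  Y <- iris$Sepal.Length

  # Build the resampling indexes once, with a fixed seed, and reuse them.
  set.seed(42)
  shared_index <- createMultiFolds(Y, k = 5, times = 2)
  myControl <- trainControl(method = "repeatedcv", number = 5, repeats = 2,
                            savePredictions = TRUE, index = shared_index)

  fit_lm  <- train(X, Y, method = "lm",  trControl = myControl)
  fit_knn <- train(X, Y, method = "knn", trControl = myControl)

  # Compare the resampling rows recorded by the two models.
  same_index <- identical(fit_lm$control$index, fit_knn$control$index)
  print(same_index)
}
```

The point of the design is that `createMultiFolds` is called exactly once, so no model can end up with a different random partition.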

amladv commented 10 years ago

I am using the same train control function for all models. I will post a reproducible example on iris shortly.

amladv commented 10 years ago

Here it goes (very long). In the first part I exclude all models that fail and produce an error for whatever reason, as well as those that produce RMSE=NA for greedy. I am sure this part could be embedded somehow (I am fairly new to code). The second part tests the models again, conditioned on being functional in the first part. It produces a similar error. Thanks again.
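
The exclusion step described above could, as one possibility, be folded into a loop that catches errors while fitting. This is a base-R sketch under stated assumptions: `fit_all` and `toy_fit` are hypothetical names, and `toy_fit` only imitates a fitted model's `$results$RMSE`; in the setting of this thread the fit function would wrap a call to caret's `train`.

```r
# fit_fun is a hypothetical stand-in; here it would wrap something like
# train(X[train,], Y[train], method = m, trControl = myControl, preProcess = PP).
fit_all <- function(methods, fit_fun) {
  fits <- lapply(methods, function(m) {
    tryCatch(fit_fun(m), error = function(e) {
      message("Skipping ", m, ": ", conditionMessage(e))
      NULL
    })
  })
  names(fits) <- methods
  # Keep only fits that succeeded and whose first RMSE is not NA.
  keep <- vapply(fits, function(f) {
    !is.null(f) && !is.na(f$results$RMSE[1])
  }, logical(1))
  fits[keep]
}

# Toy fit function: errors for one method, returns an NA RMSE for another.
toy_fit <- function(m) {
  if (m == "broken") stop("model failed to fit")
  rmse <- if (m == "na_rmse") NA_real_ else 0.3
  list(method = m, results = data.frame(RMSE = rmse))
}

working <- fit_all(c("lm", "broken", "na_rmse", "rf"), toy_fit)
print(names(working))
```

The returned list contains only usable fits, so it could be handed on without a second filtering pass.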

# Install any packages that are missing, then load all of them
ipak <- function(pkg){
    new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
    if (length(new.pkg)) 
        install.packages(new.pkg, dependencies = TRUE)
    sapply(pkg, require, character.only = TRUE)
}
 packages <-  
"arm","MASS","LogicReg","earth","RSNNS","neuralnet","nnet","qrnn","pls","spls", "elasticnet",   

devtools::install_github('zachmayer/caretEnsemble')
cl <- makeCluster(detectCores())

X <-model.matrix(iris$Sepal.Length~iris$Sepal.Width+iris$Petal.Length)[,-1]
X <- data.frame(X)
Y <-iris$Sepal.Length
train<-runif(nrow (X))<=0.80
myControl <- trainControl(method='cv', number=folds, repeats=repeats, 
returnResamp='none', returnData=FALSE, savePredictions=TRUE, 
verboseIter=TRUE, allowParallel=TRUE, index=createMultiFolds(Y[train], k=folds, 
times=repeats))
PP <- c('center','scale')
#Method Value: bag from package caret with tuning parameter vars (dual use)
model1 <- train(X[train,], Y[train], method='bag', trControl=myControl, preProcess=PP)
#Method Value: bagEarth from package caret with tuning parameters: nprune, degree (dual use)
model2 <- train(X[train,], Y[train], method='bagEarth', trControl=myControl, preProcess=PP)
#Method Value:logicBag from package logicFS with tuning parameters:ntrees,nleaves (dual use)
model3 <- train(X[train,], Y[train], method='logicBag', trControl=myControl, preProcess=PP)
#Method Value: treebag from package ipred with no tuning parameters (dual use)
model4 <- train(X[train,], Y[train], method='treebag', trControl=myControl, preProcess=PP)
#Boosted Trees 
#Method Value: blackboost from package mboost with tuning parameters: maxdepth, mstop (dual use)
model5 <- train(X[train,], Y[train], method='blackboost', trControl=myControl, preProcess=PP)
#Method Value: bstTree from package bst with tuning parameters: nu, maxdepth, mstop (dual use)
model6 <- train(X[train,], Y[train], method='bstTree', trControl=myControl, preProcess=PP)
#Method Value: gbm from package gbm with tuning parameters: interaction.depth, n.trees, shrinkage (dual use)
model7 <- train(X[train,], Y[train], method='gbm', trControl=myControl, preProcess=PP)
#Boosting (Non-Tree)
#Method Value: bstLs from package bst with tuning parameters: mstop, nu (dual use)
model8 <- train(X[train,], Y[train], method='bstLs', trControl=myControl, preProcess=PP)
#Method Value: bstSm from package bst with tuning parameters: nu, mstop (dual use)
model9 <- train(X[train,], Y[train], method='bstSm', trControl=myControl, preProcess=PP)
#Method Value: gamboost from package mboost with tuning parameters: prune, mstop (dual use)
model10 <- train(X[train,], Y[train], method='gamboost', trControl=myControl, preProcess=PP)
#Method Value: glmboost from package mboost with tuning parameters: prune, mstop (dual use)
model11 <- train(X[train,], Y[train], method='glmboost', trControl=myControl, preProcess=PP)
#Elastic Net
#Method Value: glmnet from package glmnet with tuning parameters: alpha, lambda (dual use)
model12 <- train(X[train,], Y[train], method='glmnet', trControl=myControl, preProcess=PP)
#Gaussian Processes
#Method Value: gaussprLinear from package kernlab with no tuning parameters (dual use)
model13 <- train(X[train,], Y[train], method='gaussprLinear', trControl=myControl, preProcess=PP)
#Method Value: gaussprPoly from package kernlab with tuning parameters: degree, scale (dual use)
model14 <- train(X[train,], Y[train], method='gaussprPoly', trControl=myControl, preProcess=PP)
#Method Value: gaussprRadial from package kernlab with tuning parameter sigma (dual use)
model15 <- train(X[train,], Y[train], method='gaussprRadial', trControl=myControl, preProcess=PP)
#Generalized additive model
#Method Value: gam from package mgcv with tuning parameters: select, method (dual use)
model16 <- train(X[train,], Y[train], method='gam', trControl=myControl, preProcess=PP)
#Method Value: gamLoess from package gam with tuning parameters: degree, span (dual use)
model17 <- train(X[train,], Y[train], method='gamLoess', trControl=myControl, preProcess=PP)
#Method Value: gamSpline from package gam with tuning parameter df (dual use)
model18 <- train(X[train,], Y[train], method='gamSpline', trControl=myControl, preProcess=PP)
#Generalized linear model
#Method Value: glm from package stats with no tuning parameters (dual use)
model19 <- train(X[train,], Y[train], method='glm', trControl=myControl, preProcess=PP)
#Method Value: bayesglm from package arm with no tuning parameters (dual use)
model20 <- train(X[train,], Y[train], method='bayesglm', trControl=myControl, preProcess=PP)
#Method Value: glmStepAIC from package MASS with no tuning parameters (dual use)
model21 <- train(X[train,], Y[train], method='glmStepAIC', trControl=myControl, preProcess=PP)
#Independent Component Regression
#Method Value: icr from package caret with tuning parameter n.comp (regression only)
model22 <- train(X[train,], Y[train], method='icr', trControl=myControl, preProcess=PP)
#K Nearest Neighbor
#Method Value: knn from package caret with tuning parameter k (dual use)
model23 <- train(X[train,], Y[train], method='knn', trControl=myControl, preProcess=PP)
#Linear Least Squares
#Method Value: leapBackward from package leaps with tuning parameter nvmax (regression only)
model24 <- train(X[train,], Y[train], method='leapBackward', trControl=myControl, preProcess=PP)
#Method Value: leapForward from package leaps with tuning parameter nvmax (regression only)
model25 <- train(X[train,], Y[train], method='leapForward', trControl=myControl, preProcess=PP)
#Method Value: leapSeq from package leaps with tuning parameter nvmax (regression only)
model26 <- train(X[train,], Y[train], method='leapSeq', trControl=myControl, preProcess=PP)
#Method Value: lm from package stats with no tuning parameters (regression only)
model27 <- train(X[train,], Y[train], method='lm', trControl=myControl, preProcess=PP)
#Method Value: lmStepAIC from package MASS with no tuning parameters (regression only)
model28 <- train(X[train,], Y[train], method='lmStepAIC', trControl=myControl, preProcess=PP)
#Method Value: rlm from package MASS with no tuning parameters (regression only)
model29 <- train(X[train,], Y[train], method='rlm', trControl=myControl, preProcess=PP)
#Logic Regression
#Method Value: logreg from package LogicReg with tuning parameters: treesize, ntrees (dual use)
model30 <- train(X[train,], Y[train], method='logreg', trControl=myControl, preProcess=PP)
#Multivariate Adaptive Regression Spline
#Method Value: earth from package earth with tuning parameters: nprune, degree (dual use)
model31 <- train(X[train,], Y[train], method='earth', trControl=myControl, preProcess=PP)
#Method Value: gcvEarth from package earth with tuning parameter degree (dual use)
model32 <- train(X[train,], Y[train], method='gcvEarth', trControl=myControl, preProcess=PP)
#Neural Networks
#Method Value: avNNet from package caret with tuning parameters: size, bag, decay (dual use)
model33 <- train(X[train,], Y[train], method='avNNet', trControl=myControl, preProcess=PP)
#Method Value: mlp from package RSNNS with tuning parameter size (dual use)
model34 <- train(X[train,], Y[train], method='mlp', trControl=myControl, preProcess=PP)
#Method Value: mlpWeightDecay from package RSNNS with tuning parameters: decay, size (dual use)
model35 <- train(X[train,], Y[train], method='mlpWeightDecay', trControl=myControl, 
trace=FALSE, preProcess=PP)
#Method Value: neuralnet from package neuralnet with tuning parameters: layer2, layer1, layer3 (regression only)
model36 <- train(X[train,], Y[train], method='neuralnet', trControl=myControl, preProcess=PP)
#Method Value: nnet from package nnet with tuning parameters: size, decay (dual use)
model37 <- train(X[train,], Y[train], method='nnet', trControl=myControl, preProcess=PP)
#Method Value: pcaNNet from package caret with tuning parameters: size, decay (dual use)
model38 <- train(X[train,], Y[train], method='pcaNNet', trControl=myControl, preProcess=PP)
#Partial Least Squares 
#Method Value: kernelpls from package pls with tuning parameter ncomp (dual use)
model40 <- train(X[train,], Y[train], method='kernelpls', trControl=myControl, preProcess=PP)
#Method Value: pls from package pls with tuning parameter ncomp (dual use)
model41 <- train(X[train,], Y[train], method='pls', trControl=myControl, preProcess=PP)
#Method Value: simpls from package pls with tuning parameter ncomp (dual use)
model42 <- train(X[train,], Y[train], method='simpls', trControl=myControl, preProcess=PP)
#Method Value: spls from package spls with tuning parameters: eta, kappa, K (dual use)
model43 <- train(X[train,], Y[train], method='spls', trControl=myControl, preProcess=PP)
#Method Value: widekernelpls from package pls with tuning parameter ncomp (dual use)
model44 <- train(X[train,], Y[train], method='widekernelpls', trControl=myControl, preProcess=PP)
#Penalized Linear Models 
#Method Value: enet from package elasticnet with tuning parameters: fraction, lambda (regression only)
model45 <- train(X[train,], Y[train], method='enet', trControl=myControl, preProcess=PP)
#Method Value: foba from package foba with tuning parameters: lambda, k (regression only)
model46 <- train(X[train,], Y[train], method='foba', trControl=myControl, preProcess=PP)
#Method Value: krlsPoly from package KRLS with tuning parameters: lambda, degree (regression only)
model47 <- train(X[train,], Y[train], method='krlsPoly', trControl=myControl, preProcess=PP)
#Method Value: krlsRadial from package KRLS with tuning parameters: sigma, lambda (regression only)
model48 <- train(X[train,], Y[train], method='krlsRadial', trControl=myControl, preProcess=PP)
#Method Value: lars from package lars with tuning parameter fraction (regression only)
model49 <- train(X[train,], Y[train], method='lars', trControl=myControl, preProcess=PP)
#Method Value: lars2 from package lars with tuning parameter step (regression only)
model50 <- train(X[train,], Y[train], method='lars2', trControl=myControl, preProcess=PP)
#Method Value: lasso from package elasticnet with tuning parameter fraction (regression only)
model51 <- train(X[train,], Y[train], method='lasso', trControl=myControl, preProcess=PP)
#Method Value: penalized from package penalized with tuning parameters: lambda1, lambda2 (regression only)
model52 <- train(X[train,], Y[train], method='penalized', trControl=myControl, preProcess=PP)
#Method Value: relaxo from package relaxo with tuning parameters: lambda, phi (regression only)
model53<- train(X[train,], Y[train], method='relaxo', trControl=myControl, preProcess=PP)
#Method Value: ridge from package elasticnet with tuning parameter lambda (regression only)
model54 <- train(X[train,], Y[train], method='ridge', trControl=myControl, preProcess=PP)
#Principal Component Regression
#Method Value: pcr from package pls with tuning parameter ncomp (regression only)
model55<- train(X[train,], Y[train], method='pcr', trControl=myControl, preProcess=PP)
#Projection Pursuit Regression
#Method Value: ppr from package stats with tuning parameter nterms (regression only)
model56 <- train(X[train,], Y[train], method='ppr', trControl=myControl, preProcess=PP)
#Radial Basis Function Networks
#Method Value: rbf from package RSNNS with tuning parameter size (dual use)
model57 <- train(X[train,], Y[train], method='rbfDDA', trControl=myControl, preProcess=PP)
#Random Forests
#Method Value: Boruta from package Boruta with tuning parameter mtry (dual use)
model58 <- train(X[train,], Y[train], method='Boruta', trControl=myControl, preProcess=PP)
#Method Value: cforest from package party with tuning parameter mtry (dual use)
model59 <- train(X[train,], Y[train], method='cforest', trControl=myControl, preProcess=PP)
#Method Value: parRF from package randomForest with tuning parameter mtry (dual use)
model60 <- train(X[train,], Y[train], method='parRF', trControl=myControl, preProcess=PP)
#Method Value: qrf from package quantregForest with tuning parameter mtry (regression only)
model61 <- train(X[train,], Y[train], method='qrf', trControl=myControl, preProcess=PP)
#Method Value: rf from package randomForest with tuning parameter mtry (dual use)
model62 <- train(X[train,], Y[train], method='rf', trControl=myControl, preProcess=PP)
#Method Value: RRF from package RRF with tuning parameters: mtry, coefReg, coefImp (dual use)
model63 <- train(X[train,], Y[train], method='RRF', trControl=myControl, preProcess=PP)
#Method Value: RRFglobal from package RRF with tuning parameters: coefReg, mtry (dual use)
model64 <- train(X[train,], Y[train], method='RRFglobal', trControl=myControl, preProcess=PP)
#Recursive Partitioning
#Method Value: ctree from package party with tuning parameter mincriterion (dual use)
model65 <- train(X[train,], Y[train], method='ctree', trControl=myControl, preProcess=PP)
#Method Value: ctree2 from package party with tuning parameter maxdepth (dual use)
model66 <- train(X[train,], Y[train], method='ctree2', trControl=myControl, preProcess=PP)
#Method Value: evtree from package evtree with tuning parameter alpha (dual use)
model67 <- train(X[train,], Y[train], method='evtree', trControl=myControl, preProcess=PP)
#Method Value: obliqueTree from package oblique.Tree with tuning parameters: variable.selection, oblique.splits (dual use)
model69 <- train(X[train,], Y[train], method='oblique.Tree', trControl=myControl, preProcess=PP)
#Method Value: partDSA from package partDSA with tuning parameters:, MPD (dual use)
model70 <- train(X[train,], Y[train], method='partDSA', trControl=myControl, preProcess=PP)
#Method Value: rpart from package rpart with tuning parameter cp (dual use)
model71 <- train(X[train,], Y[train], method='rpart', trControl=myControl, preProcess=PP)
#Method Value: rpart2 from package rpart with tuning parameter maxdepth (dual use)
model72 <- train(X[train,], Y[train], method='rpart2', trControl=myControl, preProcess=PP)
#Relevance Vector Machines
#Method Value: rvmLinear from package kernlab with no tuning parameters (regression only)
model73 <- train(X[train,], Y[train], method='rvmLinear', trControl=myControl, preProcess=PP)
#Method Value: rvmPoly from package kernlab with tuning parameters: scale, degree (regression only)
model74 <- train(X[train,], Y[train], method='rvmPoly', trControl=myControl, preProcess=PP)
#Method Value: rvmRadial from package kernlab with tuning parameter sigma (regression only)
model75 <- train(X[train,], Y[train], method='rvmRadial', trControl=myControl, preProcess=PP)
#Rule-Based Models
#Method Value: cubist from package Cubist with tuning parameters: committees, neighbors (regression only)
model76 <- train(X[train,], Y[train], method='cubist', trControl=myControl, preProcess=PP)
#Method Value: M5 from package RWeka with tuning parameters: rules, pruned, smoothed (regression only)
model77 <- train(X[train,], Y[train], method='M5', trControl=myControl, preProcess=PP)
#Method Value: M5Rules from package RWeka with tuning parameters: pruned, smoothed (regression only)
model78 <- train(X[train,], Y[train], method='M5Rules', trControl=myControl, preProcess=PP)
#Self-Organizing Maps
#Method Value: bdk from package kohonen with tuning parameters: topo, ydim, xweight, xdim (dual use)
model79 <- train(X[train,], Y[train], method='bdk', trControl=myControl, preProcess=PP)
#Method Value: xyf from package kohonen with tuning parameters: xdim, ydim, topo, xweight (dual use)
model80 <- train(X[train,], Y[train], method='xyf', trControl=myControl, preProcess=PP)
#Supervised Principal Components
#Method Value: superpc  package superpc with tuning parameters: Threshold, n.components 
model81 <- train(X[train,], Y[train], method='superpc', trControl=myControl, preProcess=PP)
#Support Vector Machines
#Method Value: svmLinear from package kernlab with tuning parameter C (dual use)
model82 <- train(X[train,], Y[train], method='svmLinear', trControl=myControl, preProcess=PP)
#Method Value: svmPoly from package kernlab with tuning parameters: degree, scale, C (dual use)
model83 <- train(X[train,], Y[train], method='svmPoly', trControl=myControl, preProcess=PP)
#Method Value: svmRadial from package kernlab with tuning parameters: C, sigma (dual use)
model84 <- train(X[train,], Y[train], method='svmRadial', trControl= myControl, preProcess=PP)
#Method Value: svmRadialCost from package kernlab with tuning parameter C (dual use)
model85 <- train(X[train,], Y[train], method='svmRadialCost', trControl=myControl, preProcess=PP)
#Keep each fitted model (as Model1 ... Model85) only if it exists and its first RMSE is not NA;
#otherwise store 0 as a placeholder
for (i in 1:85) {
  name <- paste0("model", i)
  keep <- exists(name) && !is.na(get(name)$results$RMSE[1])
  assign(paste0("Model", i), if (keep) get(name) else 0)
}
X <-model.matrix(iris$Sepal.Length~iris$Sepal.Width+iris$Petal.Length)[,-1]
X <- data.frame(X)
Y <-iris$Sepal.Length
train<-runif(nrow (X))<=0.80
myControl <- trainControl(method='cv', number=folds, repeats=repeats, returnResamp='none',  
returnData=FALSE, savePredictions=TRUE, verboseIter=TRUE, allowParallel=TRUE,   
index=createMultiFolds(Y[train], k=folds, times=repeats))
PP <- c('center','scale')
model1<-if(Model1!= "0")  train(X[train,], Y[train], method='bag', trControl=myControl,   
preProcess=PP) else 0
model2<-if(Model2!= "0") train(X[train,], Y[train], method='bagEarth', trControl=myControl,         
preProcess=PP) else 0 
model3<-if(Model3!= "0") train(X[train,], Y[train], method='logicBag', trControl=myControl,     
preProcess=PP) else 0
model4<-if(Model4!= "0") train(X[train,], Y[train], method='treebag', trControl=myControl,  
preProcess=PP)else 0
model5<-if(Model5!= "0") train(X[train,], Y[train], method='blackboost', trControl=myControl, 
preProcess=PP)else 0
model6<-if(Model6!= "0") train(X[train,], Y[train], method='bstTree', trControl=myControl, 
preProcess=PP) else 0
model7<-if(Model7!= "0") train(X[train,], Y[train], method='gbm', trControl=myControl,   
preProcess=PP) else 0
model8<-if(Model8!= "0") train(X[train,], Y[train], method='bstLs', trControl=myControl,   
preProcess=PP) else 0
model9<-if(Model9!= "0") train(X[train,], Y[train], method='bstSm', trControl=myControl, 
preProcess=PP) else 0
model10<-if(Model10!= "0")train(X[train,], Y[train], method='gamboost', trControl=myControl, 
preProcess=PP) else 0
model11<-if(Model11!= "0")  train(X[train,], Y[train], method='glmboost', trControl=myControl,   
preProcess=PP)else 0
model12<-if(Model12!= "0")  train(X[train,], Y[train], method='glmnet', trControl=myControl,   
preProcess=PP)else 0
model13<-if(Model13!= "0")  train(X[train,], Y[train], method='gaussprLinear',   
trControl=myControl, preProcess=PP)else 0
model14<-if(Model14!= "0")  train(X[train,], Y[train], method='gaussprPoly', trControl=myControl, 
preProcess=PP) else 0
model15<-if(Model15!= "0")  train(X[train,], Y[train], method='gaussprRadial', 
trControl=myControl, preProcess=PP)else 0 
model16<-if(Model16!= "0")  train(X[train,], Y[train], method='gam', trControl=myControl, 
preProcess=PP) else 0
model17<-if(Model17!= "0")  train(X[train,], Y[train], method='gamLoess', trControl=myControl, 
preProcess=PP)else 0
model18<-if(Model18!= "0")  train(X[train,], Y[train], method='gamSpline', trControl=myControl, 
preProcess=PP) else 0
model19<-if(Model19!= "0")  train(X[train,], Y[train], method='glm', trControl=myControl, 
preProcess=PP) else 0
model20<-if(Model20!= "0")  train(X[train,], Y[train], method='bayesglm', trControl=myControl, 
preProcess=PP) else 0
model21<-if(Model21!= "0")  train(X[train,], Y[train], method='glmStepAIC', trControl=myControl, 
preProcess=PP) else 0
model22<-if(Model22!= "0")  train(X[train,], Y[train], method='icr', trControl=myControl, 
preProcess=PP) else 0
model23<-if(Model23!= "0")  train(X[train,], Y[train], method='knn', trControl=myControl,   
preProcess=PP) else 0
model24<-if(Model24!= "0")  train(X[train,], Y[train], method='leapBackward', 
trControl=myControl, preProcess=PP) else 0
model25<-if(Model25!= "0")  train(X[train,], Y[train], method='leapForward', trControl=myControl,   
preProcess=PP) else 0 
model26<-if(Model26!= "0")  train(X[train,], Y[train], method='leapSeq', trControl=myControl, 
preProcess=PP) else 0
model27<-if(Model27!= "0")  train(X[train,], Y[train], method='lm', trControl=myControl,   
preProcess=PP) else 0
model28<-if(Model28!= "0")  train(X[train,], Y[train], method='lmStepAIC', trControl=myControl, 
preProcess=PP) else 0
model29<-if(Model29!= "0")  train(X[train,], Y[train], method='rlm', trControl=myControl,   
preProcess=PP) else 0
model30<-if(Model30!= "0")  train(X[train,], Y[train], method='logreg', trControl=myControl,   
preProcess=PP) else 0
model31<-if(Model31!= "0")  train(X[train,], Y[train], method='earth', trControl=myControl,     
preProcess=PP) else 0
model32<-if(Model32!= "0")  train(X[train,], Y[train], method='gcvEarth', trControl=myControl,    
preProcess=PP) else 0
model33<-if(Model33!= "0")  train(X[train,], Y[train], method='avNNet', trControl=myControl, 
preProcess=PP) else 0
model34<-if(Model34!= "0")  train(X[train,], Y[train], method='mlp', trControl=myControl, 
preProcess=PP) else 0
model35<-if(Model35!= "0")  train(X[train,], Y[train], method='mlpWeightDecay', trControl  
=myControl, trace=FALSE, preProcess=PP) else 0
model36<-if(Model36!= "0")  train(X[train,], Y[train], method='neuralnet', trControl=myControl,    
preProcess=PP) else 0
model37<-if(Model37!= "0")  train(X[train,], Y[train], method='nnet', trControl=myControl,     
preProcess=PP) else 0
model38<-if(Model38!= "0")  train(X[train,], Y[train], method='pcaNNet', trControl=myControl,  
preProcess=PP) else 0
model40<-if(Model40!= "0") train(X[train,], Y[train], method='kernelpls', trControl=myControl, 
preProcess=PP) else 0
model41<-if(Model41!= "0") train(X[train,], Y[train], method='pls', trControl=myControl, 
preProcess=PP) else 0
model42<-if(Model42!= "0") train(X[train,], Y[train], method='simpls', trControl=myControl, 
preProcess=PP) else 0
model43<-if(Model43!= "0") train(X[train,], Y[train], method='spls', trControl=myControl, 
preProcess=PP) else 0
model44<-if(Model44!= "0") train(X[train,], Y[train], method='widekernelpls', trControl=myControl, 
preProcess=PP)else 0
model45<-if(Model45!= "0") train(X[train,], Y[train], method='enet', trControl=myControl, 
preProcess=PP) else 0
model46<-if(Model46!= "0") train(X[train,], Y[train], method='foba', trControl=myControl, 
preProcess=PP) else 0
model47<-if(Model47!= "0") train(X[train,], Y[train], method='krlsPoly', trControl=myControl, 
preProcess=PP) else 0
model48<-if(Model48!= "0") train(X[train,], Y[train], method='krlsRadial', trControl=myControl,   
preProcess=PP) else 0
model49<-if(Model49!= "0") train(X[train,], Y[train], method='lars', trControl=myControl,   
preProcess=PP) else 0
model50<-if(Model50!= "0") train(X[train,], Y[train], method='lars2', trControl=myControl, 
preProcess=PP) else 0
model51<-if(Model51!= "0") train(X[train,], Y[train], method='lasso', trControl=myControl,   
preProcess=PP) else 0
model52<-if(Model52!= "0") train(X[train,], Y[train], method='penalized', trControl=myControl, 
preProcess=PP) else 0
model53<-if(Model53!= "0") train(X[train,], Y[train], method='relaxo', trControl=myControl, 
preProcess=PP) else 0
model54<-if(Model54!= "0") train(X[train,], Y[train], method='ridge', trControl=myControl, 
preProcess=PP) else 0
model55<-if(Model55!= "0") train(X[train,], Y[train], method='pcr', trControl=myControl, 
preProcess=PP) else 0
model56<-if(Model56!= "0") train(X[train,], Y[train], method='ppr', trControl=myControl, 
preProcess=PP) else 0
model57<-if(Model57!= "0") train(X[train,], Y[train], method='rbfDDA', trControl=myControl, 
preProcess=PP) else 0
model58<-if(Model58!= "0") train(X[train,], Y[train], method='Boruta', trControl=myControl, 
preProcess=PP) else 0
model59<-if(Model59!= "0") train(X[train,], Y[train], method='cforest', trControl=myControl, 
preProcess=PP) else 0
model60<-if(Model60!= "0") train(X[train,], Y[train], method='parRF', trControl=myControl,   
preProcess=PP) else 0
model61<-if(Model61!= "0") train(X[train,], Y[train], method='qrf', trControl=myControl,   
preProcess=PP) else 0
model62<-if(Model62!= "0") train(X[train,], Y[train], method='rf', trControl=myControl,    
preProcess=PP) else 0
model63<-if(Model63!= "0") train(X[train,], Y[train], method='RRF', trControl=myControl, 
preProcess=PP) else 0
model64<-if(Model64!= "0") train(X[train,], Y[train], method='RRFglobal', trControl=myControl, 
preProcess=PP) else 0
model65<-if(Model65!= "0") train(X[train,], Y[train], method='ctree', trControl=myControl, 
preProcess=PP) else 0
model66<-if(Model66!= "0") train(X[train,], Y[train], method='ctree2', trControl=myControl, 
preProcess=PP) else 0
model67<-if(Model67!= "0") train(X[train,], Y[train], method='evtree', trControl=myControl, 
preProcess=PP) else 0
model69<-if(Model69!= "0") train(X[train,], Y[train], method='oblique.Tree', trControl=myControl, 
preProcess=PP) else 0
model70<-if(Model70!= "0") train(X[train,], Y[train], method='partDSA', trControl=myControl, 
preProcess=PP) else 0
model71<-if(Model71!= "0") train(X[train,], Y[train], method='rpart', trControl=myControl, 
preProcess=PP) else 0
model72<-if(Model72!= "0") train(X[train,], Y[train], method='rpart2', trControl=myControl,     
preProcess=PP) else 0
model73<-if(Model73!= "0") train(X[train,], Y[train], method='rvmLinear', trControl=myControl,    
preProcess=PP) else 0
model74<-if(Model74!= "0") train(X[train,], Y[train], method='rvmPoly', trControl=myControl, 
preProcess=PP) else 0
model75<-if(Model75!= "0") train(X[train,], Y[train], method='rvmRadial', trControl=myControl, 
preProcess=PP) else 0
model76<-if(Model76!= "0") train(X[train,], Y[train], method='cubist', trControl=myControl, 
preProcess=PP) else 0
model77<-if(Model77!= "0") train(X[train,], Y[train], method='M5', trControl=myControl, 
preProcess=PP) else 0
model78<-if(Model78!= "0") train(X[train,], Y[train], method='M5Rules', trControl=myControl, 
preProcess=PP) else 0
model79<-if(Model79!= "0") train(X[train,], Y[train], method='bdk', trControl=myControl, 
preProcess=PP) else 0
model80<-if(Model80!= "0") train(X[train,], Y[train], method='xyf', trControl=myControl, 
preProcess=PP) else 0
model81<-if(Model81!= "0") train(X[train,], Y[train], method='superpc', trControl=myControl, 
preProcess=PP) else 0
model82<-if(Model82!= "0") train(X[train,], Y[train], method='svmLinear', trControl=myControl, 
preProcess=PP) else 0
model83<-if(Model83!= "0") train(X[train,], Y[train], method='svmPoly', trControl=myControl, 
preProcess=PP) else 0
model84<-if(Model84!= "0") train(X[train,], Y[train], method='svmRadial', trControl= myControl, 
preProcess=PP) else 0
model85<-if(Model85!= "0") train(X[train,], Y[train], method='svmRadialCost', 
trControl=myControl, preProcess=PP)else 0

all.models<- list (model1,model2,model3,model4,model5,model6,model7,model8, model9,    
model10, model11,model12,model13,model14, model15, model16,model17,model18,   
model19,model20,model21, model22,model23,model24,model25,model26,model27,  
model28,model29,model30, model31,model32, model33, model34,model35,model36,model37, 
model38, model40,model41, model42, model43, model44,model45, model46,model47, 
model48,model49,model50,model51,model52, model53, model54,model55,model56, 
model57,model58, model59,model60,model61,model62, model63,model64, 
model65,model66,model67, model69,model70,model71,model72,model73,model74,
model75,model76,model77,model78,model79,model80,model81,model82,model83,
model84,model85)

all.models<- all.models[all.models != "0"]

 names(all.models) <- sapply(all.models, function(x) x$method)

sort(sapply(all.models, function(x) min(x$results$RMSE)))

greedy <- caretEnsemble(all.models, iter=1000L)
sort(greedy$weights, decreasing=TRUE)
jknowles commented 10 years ago

The problem is that you are not passing custom seeds for the resampling in trainControl. You need to pass custom seeds so that all resamples are split using the same random split of the data. See the documentation of seeds for trainControl. I'll have an example of this available to look at shortly.

zachmayer commented 10 years ago

Thanks Jared.

The other option is to explicitly pass the indexes to trainControl. This is not clear in the function's documentation.
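A minimal sketch of what passing the indexes explicitly could look like, assuming the caret package is installed. The iris columns, the fold counts, and the name shared_index are illustrative choices, not taken from the script above. The point is that createMultiFolds is called once and the same list of training rows is handed to every train call through trainControl.

```r
library(caret)

set.seed(42)
# Illustrative regression setup on iris (an assumption, not the original data).
Y <- iris$Sepal.Length
X <- iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")]

folds <- 10
repeats <- 3

# Build the resampling indexes once; each element holds the training rows
# for one fold/repeat combination.
shared_index <- createMultiFolds(Y, k = folds, times = repeats)

# Every model trained with this control object sees identical resamples.
myControl <- trainControl(
  method = "repeatedcv", number = folds, repeats = repeats,
  index = shared_index,
  savePredictions = TRUE
)

length(shared_index)  # one element per fold and repeat
```

Because the splits are stored in the control object, they no longer depend on the state of the random number generator when each model happens to be trained.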

amladv commented 10 years ago

Understood, many thanks to both

jknowles commented 10 years ago

Ah yes,

That's another way to do it. I will need to keep that in mind as an option to pass through to the caretList function. As it is now, I built a helper function to pass appropriately constant (but random) seeds to trainControl on the fly. This will need some attention, but should make a big script like that above easy to execute in a few lines of code instead.
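As a rough illustration of the "few lines of code" idea, and not the helper function mentioned above, the per-model if/else lines could be replaced by a loop over method names that drops any model that fails to train. This sketch assumes the caret and rpart packages are installed; the two method names, the iris columns, and the variable names are placeholders.

```r
library(caret)

set.seed(42)
# Illustrative data (an assumption); the original script uses X[train,], Y[train].
Y <- iris$Sepal.Length
X <- iris[, c("Sepal.Width", "Petal.Length", "Petal.Width")]

# One shared set of resampling indexes for all models.
myControl <- trainControl(
  method = "cv", number = 5,
  index = createFolds(Y, k = 5, returnTrain = TRUE),
  savePredictions = TRUE
)

# Placeholder list; in practice this would hold all candidate methods.
methods <- c("lm", "rpart")

# Train each method, returning NULL instead of stopping when one fails.
fits <- lapply(methods, function(m) {
  tryCatch(
    train(X, Y, method = m, trControl = myControl),
    error = function(e) NULL
  )
})
names(fits) <- methods

# Keep only the models that trained successfully.
all.models <- Filter(Negate(is.null), fits)
```

Filtering on NULL avoids the separate "0" placeholder and the manual first pass that excludes failing models.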

amladv commented 10 years ago

I came across what I believe is an unrelated issue once the script finished running.

greedy <- caretEnsemble(all.models, iter=1000L)
Error in colnames<-(*tmp*, value = c("bagEarth", "treebag", "blackboost", : attempt to set 'colnames' on an object with less than two dimensions.

Might this be related to how one or more models interact with the matrix of observations and predictions?

amladv commented 10 years ago

This last issue seems to be model related (maybe caret 6.0). When I ran caretEnsemble on each model individually, 4 models gave the following error.

all.models<- list (model81)
greedy <- caretEnsemble(all.models, iter=1000L)
Error in pred + X : non-conformable arrays

With another data set (not iris), the problem occurs in the same models, and one additional model sometimes fails as well.

jknowles commented 10 years ago

@amladv try running a modification like this:

mseeds <- vector(mode = "list", length = 11)
for(i in 1:10) mseeds[[i]] <- sample.int(1000, 3)
mseeds[[11]] <- sample.int(1000, 1)
myControl = trainControl(method = "cv", number = 10, repeats = 1, 
                      p = 0.75, savePredictions = TRUE, 
                      classProbs = FALSE, returnResamp = "final", 
                      returnData = TRUE, seeds = mseeds)

You'll need to modify for(i in 1:10) mseeds[[i]] <- sample.int(1000, M) for your particular data. The loop should run once per resample (folds * repeats), and M should be the number of elements in your largest tuning grid of any model. The list needs one extra element at the end, a single integer used for the final model fit.
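The sizing rule above can be wrapped in a small function. This is only a sketch in base R: make_seeds and its max_seed argument are hypothetical names, not part of caret or caretEnsemble. Going by the trainControl documentation, seeds sets the seed used at each resampling iteration, so it is best combined with a fixed index rather than treated as a replacement for it.

```r
# Hypothetical helper: builds a seeds list shaped for trainControl(seeds = ...).
#   folds * repeats elements, each an integer vector of length M,
#   plus one final single integer for the last model fit.
make_seeds <- function(folds, repeats, M, max_seed = 1000L) {
  B <- folds * repeats
  seeds <- vector(mode = "list", length = B + 1)
  for (i in seq_len(B)) {
    seeds[[i]] <- sample.int(max_seed, M)
  }
  seeds[[B + 1]] <- sample.int(max_seed, 1)
  seeds
}

set.seed(1)
# 10-fold CV, no repeats, largest tuning grid has 3 candidates.
mseeds <- make_seeds(folds = 10, repeats = 1, M = 3)
length(mseeds)
```

With 10 folds and 3 repeats the same call would produce 31 elements, which is the case that is easy to get wrong when the loop bound is hard-coded.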

I'm working on #6 which includes a function that can set these seeds for you -- but it hasn't been tested in an extreme case like yours. Give it a look though and see if it might work for you.

amladv commented 10 years ago

Thank you Jared and Zach. It does work after adding index=createMultiFolds(Y[train], k=folds, times=repeats) to the control.

I have another question.

I would like to call the predict function once the ensemble training is done, in the form

predict(greedy, newdata=New Predicting variables)

The model seems to work, but I get the following error regardless of the models I use.

Error in `colnames<-`(`*tmp*`, value = c("model1", "model2", "model3",  : 
attempt to set 'colnames' on an object with less than two dimensions

Maybe this is because the predict function is specific to each type of model, so this is in fact not possible; I am unsure. In addition, I have to center and scale the new data.

zachmayer commented 10 years ago

Can you please post a reproducible example of the error?

Thank you.

amladv commented 10 years ago


Predict works. I made a mistake on feeding one of the models. Apologies, and thanks again.

zachmayer commented 10 years ago

hkreeves commented 10 years ago


Could you please tell me more about your fix to the following issue:

greedy <- caretEnsemble(all.models, iter=1000L)
Error in colnames<-(tmp, value = c("bagEarth", "treebag", "blackboost", : attempt to set 'colnames' on an object with less than two dimensions.

I had exactly the same problem when blending only two models, a glmnet and an rf. Both models have the same trControl and identical indices.

mycontrol$index <- createMultiFolds(train$Y, k=10, times=3)
...
log.rf.mix <- caretEnsemble(models)
Error in colnames<-(*tmp*, value = c("glmnet", "rf")) : attempt to set 'colnames' on an object with less than two dimensions

Could you please elaborate on "mistake on feeding one of the models"?

hkreeves commented 10 years ago

I tracked down the issue. It turned out the error occurred in makePredObsMatrix(), specifically when combining the pred element of each model object:

modelLibrary <- extractBestPreds(list_of_models)

The elements of modelLibrary had different numbers of rows. My RF caret object did not contain all resample groups (the last group, Fold10.Rep3, was missing), possibly due to the multicore parallel computing.

caretEnsemble itself should be perfectly fine after all. My suggestion would be to implement better error messages, so as to inform the user where to look for the problem. I notice that @zachmayer already left a "todo" in the code:

  #Insert checks here: observeds are all equal, row indexes are equal, Resamples are equal

So no worries, I expect it is coming soon.
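Until such checks exist, a user-side check along these lines might help locate the offending model. It is a sketch in base R, not caretEnsemble code: check_resamples is a hypothetical name, and it assumes each list element is a data frame with Resample and rowIndex columns, as in the pred element of a caret train object that has already been filtered to the best tuning parameters.

```r
# Hypothetical check: every model must have predictions for exactly the
# same (Resample, rowIndex) pairs as the first model in the list.
check_resamples <- function(pred_list) {
  keys <- lapply(pred_list, function(p) {
    sort(paste(p$Resample, p$rowIndex, sep = ":"))
  })
  ref <- keys[[1]]
  same <- vapply(keys, function(k) identical(k, ref), logical(1))
  if (!all(same)) {
    stop("Models whose resamples/rows differ from the first model: ",
         paste(names(pred_list)[!same], collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data: the "rf" predictions are missing an entire resample group.
glmnet_pred <- data.frame(
  Resample = rep(c("Fold1", "Fold2"), each = 2),
  rowIndex = c(1, 2, 3, 4)
)
rf_pred <- glmnet_pred[glmnet_pred$Resample != "Fold2", ]

res <- tryCatch(
  check_resamples(list(glmnet = glmnet_pred, rf = rf_pred)),
  error = function(e) conditionMessage(e)
)
print(res)
```

Running a check like this before caretEnsemble would name the model with the missing resample group instead of failing later with the colnames error.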

jknowles commented 10 years ago

@hkreeves Can you open a new issue for this, so it doesn't get lost on this issue report?