zachmayer / caretEnsemble

caret models all the way down :turtle:

Error in data.frame arguments imply differing number of rows: #201

Closed sparcycram closed 8 years ago

sparcycram commented 8 years ago

Depending on the settings in gbm1, the error appears: n.trees with by = 5 doesn't work, but by = 10 does?

# This r script is for GH simple model
# gives error Error in data.frame(interaction.depth = seq(1, 5, by = 1), n.trees = seq(10,  : 
#arguments imply differing number of rows: 5, 59, 1

# clear lists
rm(list = ls())

########################
#load Aids2 data here
library(MASS)
data("Aids2")
TRAIN <- as.data.frame(Aids2)
#############################################################

library('caret')
library('caretEnsemble')
library("gbm")
library("kernlab")
library("e1071")
library("class")
library("caTools")
library("plyr")
library("pamr")
library("cluster")
library("arm")
library("Matrix")
library("lme4")

set.seed(1234)
inTrain <- createDataPartition(y = TRAIN$status, p = .75, list = FALSE)
training <- TRAIN[ inTrain,]
testing <- TRAIN[-inTrain,]

MF<-createMultiFolds(training$status, k=2, times=5)

my_control <- trainControl(
  method='cv',
  savePredictions="final", 
  classProbs=TRUE,
  index=MF,
  summaryFunction= twoClassSummary )

set.seed(1234)
model_list_big <- caretList(
  status~., data=training,
  trControl=my_control,
  metric= "ROC",
  maximize=TRUE,
  methodList=c("pam","bayesglm","svmRadial"), 
  tuneList=list(
    gbm1=caretModelSpec(method="gbm", tuneGrid=data.frame(interaction.depth=seq(1,5, by=1),n.trees=seq(10,300, by=5),
                                                          shrinkage=0.1, n.minobsinnode = 10))
  )
)

model_list_big

########### end #######################

# > sessionInfo()
# R version 3.2.4 Revised (2016-03-16 r70336)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows Server >= 2012 x64 (build 9200)
# 
# locale:
#   [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
# [4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    
# 
# attached base packages:
#   [1] parallel  splines   stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] arm_1.8-6           lme4_1.1-11         Matrix_1.2-4        pamr_1.55           cluster_2.0.3       plyr_1.8.3          caTools_1.17.1     
# [8] class_7.3-14        e1071_1.6-7         kernlab_0.9-23      gbm_2.1.1           survival_2.38-3     caretEnsemble_2.0.0 caret_6.0-64       
# [15] ggplot2_2.1.0       lattice_0.20-33     MASS_7.3-45        
# 
# loaded via a namespace (and not attached):
#   [1] Rcpp_0.12.3        compiler_3.2.4     nloptr_1.0.4       bitops_1.0-6       iterators_1.0.8    tools_3.2.4        digest_0.6.9      
# [8] nlme_3.1-125       gtable_0.2.0       mgcv_1.8-12        foreach_1.4.3      SparseM_1.7        coda_0.18-1        gridExtra_2.2.1   
# [15] stringr_1.0.0      pROC_1.8           MatrixModels_0.4-1 stats4_3.2.4       grid_3.2.4         nnet_7.3-12        data.table_1.9.6  
# [22] pbapply_1.2-0      minqa_1.2.4        reshape2_1.4.1     car_2.1-1          magrittr_1.5       scales_0.4.0       codetools_0.2-14  
# [29] abind_1.4-3        pbkrtest_0.4-6     colorspace_1.2-6   quantreg_5.21      stringi_1.0-1      munsell_0.4.3      chron_2.3-47      
# > 
zachmayer commented 8 years ago

On a previous question, I asked you to make a minimal, reproducible example. Here is the minimal reproducible example for your problem:

data.frame(
  interaction.depth=seq(1,5, by=1),n.trees=seq(10,300, by=5),
  shrinkage=0.1, n.minobsinnode = 10)

Given this minimal example, can you figure out where you went wrong?

sparcycram commented 8 years ago

It appears as though the length of n.trees has to match the length of interaction.depth, so the settings have to reconcile. A bit difficult to do in one's head :) Thanks
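A small base R sketch of why by = 10 appeared to work while by = 5 failed, using only the sequences from the original post. The variable names (depth, trees_by5, trees_by10, grid_by10, err_by5) are made up for illustration. data.frame() recycles shorter columns only when the longest length is a whole multiple of theirs, so the by = 10 call succeeds but does not produce a full grid.

```r
# Sequences taken from the tuneGrid in the original post
depth      <- seq(1, 5, by = 1)        # 5 values
trees_by5  <- seq(10, 300, by = 5)     # 59 values
trees_by10 <- seq(10, 300, by = 10)    # 30 values

# by = 10: 30 is a multiple of 5, so data.frame() recycles depth six times
grid_by10 <- data.frame(interaction.depth = depth, n.trees = trees_by10,
                        shrinkage = 0.1, n.minobsinnode = 10)
nrow(grid_by10)   # 30 rows, not the 5 * 30 = 150 combinations of a full grid

# by = 5: 59 is not a multiple of 5, so recycling fails with the reported error
err_by5 <- tryCatch(
  data.frame(interaction.depth = depth, n.trees = trees_by5,
             shrinkage = 0.1, n.minobsinnode = 10),
  error = function(e) conditionMessage(e)
)
err_by5
```

So the by = 10 version runs, but each n.trees value is paired with only one interaction.depth, which is probably not what was intended.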

zachmayer commented 8 years ago

Yup! expand.grid is a really useful function in this situation, because it generates all possible combinations of the inputs, e.g.:

expand.grid(
  interaction.depth=seq(1,5, by=1),n.trees=seq(10,300, by=5),
  shrinkage=0.1, n.minobsinnode = 10)

But watch out! This can get big fast!
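To make the size warning concrete, here is a rough sketch that counts rows for the grid from this thread. The number of rows is the product of the lengths of the inputs. The second grid, with two shrinkage values, is a hypothetical variation added only to show how quickly the count grows; it is not from the original post.

```r
# Full grid from the thread: every depth paired with every n.trees value
grid <- expand.grid(
  interaction.depth = seq(1, 5, by = 1),
  n.trees = seq(10, 300, by = 5),
  shrinkage = 0.1,
  n.minobsinnode = 10
)
nrow(grid)     # 5 * 59 * 1 * 1 = 295

# Hypothetical variation: a second shrinkage value doubles the grid
bigger <- expand.grid(
  interaction.depth = seq(1, 5, by = 1),
  n.trees = seq(10, 300, by = 5),
  shrinkage = c(0.01, 0.1),
  n.minobsinnode = 10
)
nrow(bigger)   # 5 * 59 * 2 * 1 = 590
```

Either grid could then be passed as tuneGrid in the caretModelSpec call from the original post, in place of the data.frame(...) expression.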

sparcycram commented 8 years ago

Thanks, useful.