zachmayer / caretEnsemble

caret models all the way down :turtle:

Failing tests with update of pROC on xrobin/pROC@master #135

Closed xrobin closed 9 years ago

xrobin commented 9 years ago

I am testing a new version of pROC and found a regression in the automated tests of caretEnsemble:

R CMD check caretEnsemble_1.0.0.tar.gz
[...]
* checking tests ...
  Running ‘testthat.R’
 ERROR
Running the tests in ‘tests/testthat.R’ failed.
Last 13 lines of output:
                 970L, 972L, 973L, 974L, 977L, 980L, 984L, 986L, 988L, 990L, 991L, 993L, 
                 994L, 995L, 997L, 1000L)), .Names = c("Fold1", "Fold2", "Fold3")), indexOut = NULL, 
             timingSamps = 0, predictionBounds = c(FALSE, FALSE), seeds = NA, adaptive = structure(list(
                 min = 5, alpha = 0.05, method = "gls", complete = TRUE), .Names = c("min", 
             "alpha", "method", "complete")), allowParallel = TRUE), .Names = c("method", 
         "number", "repeats", "p", "initialWindow", "horizon", "fixedWindow", "verboseIter", 
         "returnData", "returnResamp", "savePredictions", "classProbs", "summaryFunction", 
         "selectionFunction", "preProcOptions", "index", "indexOut", "timingSamps", "predictionBounds", 
         "seeds", "adaptive", "allowParallel")), method = "knn")
  17: stop("final tuning parameters could not be determined")

  Error: Test failures
  Execution halted

I found the following reproducible example:

 library('caret')          # provides twoClassSim, trainControl, twoClassSummary
 library('caretEnsemble')  # provides caretList

 # Simulated two-class training and test sets
 train <- twoClassSim(
   n = 1000, intercept = -8, linearVars = 3,
   noiseVars = 10, corrVars = 4, corrValue = 0.6)
 test <- twoClassSim(
   n = 1500, intercept = -7, linearVars = 3,
   noiseVars = 10, corrVars = 4, corrValue = 0.6)

 # 3-fold cross-validation scored with ROC-based metrics
 myControl <- trainControl(
   method = "cv", number = 3, repeats = 1,
   p = 0.75, savePredictions = TRUE,
   summaryFunction = twoClassSummary,
   classProbs = TRUE, returnResamp = "final",
   returnData = TRUE, verboseIter = FALSE)

 # Column 23 is the outcome, Class
 test1 <- caretList(
   x = train[, -23],
   y = train[, "Class"],
   metric = "ROC",
   trControl = myControl,
   methodList = c("knn", "glm")
 )

That results in the following error:

Error in train.default(x = list(TwoFactor1 = c(1.29053342712081, 1.49079633767954,  : 
  final tuning parameters could not be determined
In addition: Warning messages:
1: In trControlCheck(x = trControl, y = target) :
  indexes not defined in trControl.  Attempting to set them ourselves, so each model in the ensemble will have the same resampling indexes.
2: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  :
  There were missing values in resampled performance measures.
3: In train.default(x = list(TwoFactor1 = c(1.29053342712081, 1.49079633767954,  :
  missing values found in aggregated results
Stack trace: 
7:stop("final tuning parameters could not be determined")
6:train.default(x = list(TwoFactor1 = c(1.29053342712081, 1.49079633767954, ...
5:(function (x, ...) ...
4:do.call(train, model_args)
3:FUN(X[[1L]], ...)
2:lapply(tuneList, function(m) {...
1:caretList(x = train[, -23], y = train[, "Class"], metric = "ROC", ...

With pROC 1.7.3, the version currently on CRAN, only the first warning ("indexes not defined in trControl") is generated.

I am a bit confused here: the error seems to be raised in the caret package itself. Is that correct? I couldn't work out which call to pROC produces the missing values. There shouldn't be any difference in behaviour unless one asks for a partial AUC.

I would appreciate any pointer as to which pROC command generates those missing values.
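
One way to narrow this down is to call pROC directly on a vector of observed classes and predicted probabilities, outside of caret. As far as I can tell, caret's twoClassSummary wraps its call to pROC::roc in try() and reports NA for ROC when that call fails, which would hide the underlying error message; treat that as an assumption about caret's internals rather than a confirmed fact. The sketch below uses made-up data (obs and prob are hypothetical stand-ins for one fold's hold-out predictions) and prints either the AUC or the error that pROC raised.

```r
library(pROC)

# Hypothetical stand-ins for one resample's hold-out data:
# observed classes and the predicted probability of the first class.
set.seed(1)
obs  <- factor(sample(c("Class1", "Class2"), 200, replace = TRUE))
prob <- runif(200)

# Use a try() wrapper (assumed to resemble what caret does) so that a
# failure is captured instead of aborting the script.
roc_obj <- try(suppressMessages(pROC::roc(obs, prob)), silent = TRUE)

if (inherits(roc_obj, "try-error")) {
  # This is the message that would otherwise be collapsed into an NA.
  message("pROC::roc failed: ", conditionMessage(attr(roc_obj, "condition")))
  auc_value <- NA_real_
} else {
  auc_value <- as.numeric(pROC::auc(roc_obj))
  print(auc_value)
}
```

Running the same snippet against both the CRAN and development versions of pROC, with real hold-out predictions in place of the simulated ones, should show whether the NA originates in pROC or elsewhere.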

zachmayer commented 9 years ago

Can you re-check with the latest version of both packages? Thanks.

xrobin commented 9 years ago

Right now I cannot build your package (GitHub version):

xavier@iMac:~/projects/pROC_test$ R CMD build caretEnsemble_github/
* checking for file ‘caretEnsemble_github/DESCRIPTION’ ... OK
* preparing ‘caretEnsemble’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
Warning in engine$weave(file, quiet = quiet) :
  Pandoc (>= 1.12.3) and/or pandoc-citeproc is not available. Please install both.
Warning in train.default(x, y, weights = w, ...) :
  The metric "Accuracy" was not in the result set. ROC will be used instead.
Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
[... the same pair of glm.fit warnings repeats 24 more times ...]
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  :
  There were missing values in resampled performance measures.
Quitting from lines 49-56 (caretEnsemble-intro.Rmd) 
Error: processing vignette 'caretEnsemble-intro.Rmd' failed with diagnostics:
Stopping
Execution halted

xrobin commented 9 years ago

I updated a bunch of other outdated packages and everything works fine now. Sorry about the false alarm.
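
For anyone hitting something similar, a quick way to spot stale dependencies is to list the installed versions of the packages involved and compare them with what the repositories offer. This is a minimal sketch using base R utilities only; the package list in pkgs is just the three packages discussed in this thread, and the repository lookup is guarded with interactive() because it needs network access and a configured CRAN mirror.

```r
# Packages discussed in this issue; extend as needed.
pkgs <- c("caret", "caretEnsemble", "pROC")

# Report the installed version of each package, or NA if it is missing.
installed <- rownames(installed.packages())
versions <- vapply(pkgs, function(p) {
  if (p %in% installed) as.character(packageVersion(p)) else NA_character_
}, character(1))
print(versions)

# Listing outdated packages requires repository access, so only attempt
# it in an interactive session.
if (interactive()) {
  outdated <- old.packages()
  if (!is.null(outdated)) {
    print(outdated[, c("Installed", "ReposVer"), drop = FALSE])
  }
}
```

If old.packages() reports anything, update.packages(ask = FALSE) brings everything up to date in one step.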