Closed Mosquito00 closed 8 years ago
Don't use twoClassSummary for regression.
Hello zachmayer,
Thank you for your answer. I deleted the twoClassSummary, but the code still doesn't work.
I got the following error:
Error: x$control$savePredictions is not TRUE
Add savePredictions=TRUE to your trainControl.
My trainControl is:
myControl = trainControl(method='cv', summaryFunction=twoClassSummary, number = folds, repeats = repeats, classProbs=TRUE, savePredictions=TRUE, index=createMultiFolds(Y[train], k=folds, times=repeats))
savePredictions was already set to TRUE... Still, it did not work.
Did you run the code? Does it work on your computer?
Thank you.
Please provide a minimal reproducible example I can copy/paste into a fresh r session and replicate the error:
http://stackoverflow.com/a/5963610
Hello zachmayer,
I already provided the code above...
Here is the code:
set.seed(40)
library(caret)
library(devtools)
library(caretEnsemble)
library(mlbench)
data(BostonHousing2)
X = model.matrix(cmedv~crim+zn+indus+chas+nox+rm+age+dis+ rad+tax+ptratio+b+lstat+lat+lon, BostonHousing2)[,-1]
X = data.frame(X)
Y = BostonHousing2$cmedv
train = runif(nrow(X)) <= .66
folds=5
repeats=1
# fold cross-validations are used as the resampling scheme.
myControl = trainControl(method='cv', summaryFunction=twoClassSummary, number = folds, repeats = repeats, classProbs=TRUE, savePredictions=TRUE, index=createMultiFolds(Y[train], k=folds, times=repeats))
PP = c('center', 'scale')
names(all.models) = sapply(all.models, function(x) x$method)
sort(sapply(all.models, function(x) min(x$results$RMSE)))
# regression, elastic net regression, or greedy optimization.
print(all.models)
greedy = caretEnsemble(all.models, iter=1000L)
print(greedy)
sort(greedy$weights, decreasing=TRUE)
greedy$error
Thank you.
How many lines of this script can you remove while still getting the error?
Actually, this is the shortest version of my code. There must be something wrong with the
myControl = trainControl(method='cv', summaryFunction=twoClassSummary, number = folds, repeats = repeats, classProbs=TRUE, savePredictions=TRUE, index=createMultiFolds(Y[train], k=folds, times=repeats))
function or the caretEnsemble():
greedy = caretEnsemble(all.models, iter=1000L)
Did you also get the same error?
Is this a regression or classification problem? If it's regression, remove the twoClassSummary bit.
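A minimal sketch of a regression-only control object, assuming caret is installed. The toy data frame `toy` and the names `reg_control` and `fit` are illustrative and not from this thread. The point is only that a numeric outcome needs neither twoClassSummary nor classProbs.

```r
library(caret)

set.seed(40)
# Toy regression data (hypothetical, for illustration only)
toy <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(60, sd = 0.1)

# Regression control: no twoClassSummary and no classProbs.
# defaultSummary is caret's default summary and reports RMSE and
# Rsquared when the outcome is numeric.
reg_control <- trainControl(
  method = "cv",
  number = 5,
  summaryFunction = defaultSummary,
  savePredictions = TRUE
)

fit <- train(y ~ x1 + x2, data = toy, method = "lm", trControl = reg_control)
print(fit$results[, c("RMSE", "Rsquared")])
```

twoClassSummary computes ROC, sensitivity, and specificity, which are only defined for a two-level factor outcome, so it cannot be applied to a numeric target such as cmedv.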
It is a regression problem...
set.seed(40)
library(caret)
library(devtools)
library(caretEnsemble)
library(mlbench)
data(BostonHousing2)
X = model.matrix(cmedv~crim+zn+indus+chas+nox+rm+age+dis+ rad+tax+ptratio+b+lstat+lat+lon, BostonHousing2)[,-1]
X = data.frame(X)
Y = BostonHousing2$cmedv
train = runif(nrow(X)) <= .66
folds=5
repeats=1
myControl = trainControl(method='cv', number = folds, repeats = repeats, classProbs=TRUE, savePredictions=TRUE, index=createMultiFolds(Y[train], k=folds, times=repeats))
PP = c('center', 'scale')
all.models = caretList(X[train,], Y[train], trControl=myControl,methodList=c('gbm', 'blackboost'))
names(all.models) = sapply(all.models, function(x) x$method)
sort(sapply(all.models, function(x) min(x$results$RMSE)))
greedy = caretEnsemble(all.models, iter=1000L)
print(greedy)
sort(greedy$weights, decreasing=TRUE)
greedy$error
This is my code. I removed twoClassSummary and it still gives the same error.
Thank you.
Yup, this is a bug. all.models[[1]]$control$savePredictions is "all".
You can downgrade your version of caret as a workaround: savePredictions can now be "all", "final" or "none", I think, so a value of TRUE no longer stays TRUE. I'll fix this in the next release of caretEnsemble.
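The failure can be illustrated in base R without caret. This is a sketch under assumptions: the error text matches the format that R's stopifnot() produces, so the check is assumed to be of that kind. The list `x` is a mock stand-in for a fitted model, and `saves_predictions` is a hypothetical helper, not the code from the actual fix.

```r
# Mock of the part of a fitted caret model that the check inspects
# (hypothetical stand-in; a real object comes from caret::train).
x <- list(control = list(savePredictions = "all"))

# A strict stopifnot()-style check rejects the string "all",
# because it is not a logical TRUE.
strict_check_passes <- tryCatch({
  stopifnot(x$control$savePredictions)
  TRUE
}, error = function(e) {
  message("strict check failed: ", conditionMessage(e))
  FALSE
})

# Hypothetical tolerant helper: accept TRUE as well as "all" and "final".
saves_predictions <- function(value) {
  isTRUE(value) || value %in% c("all", "final")
}

print(strict_check_passes)
print(saves_predictions(x$control$savePredictions))
```

A check along these lines accepts both the older logical setting and the newer string values, while still rejecting FALSE and "none".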
Thank you for your answer.
As I am new to R, I do not know how to downgrade the version of caret. Do you have any suggestions?
Maybe there is another way to perform a stacked regression? Do you happen to know another package for stacked regression, or a way to optimize the weights in a stacked regression?
Thank you very much in advance.
This should be fixed now.
Fix was here: https://github.com/zachmayer/caretEnsemble/pull/185
Dear all,
I would like to use the code from zachmayer. Unfortunately, I get an error in the following line:
"greedy = caretEnsemble(all.models, iter=1000L)"
Error: x$control$savePredictions is not TRUE
However, I tried to solve the issue in the following line:
"all.models = caretList(X[train,], Y[train], trControl=myControl,methodList=c('gbm', 'blackboost'))"
This line also produces an error and a warning message:
Error in sensitivity.default(data[, "pred"], data[, "obs"], lev[1]) : inputs must be factors
In addition: Warning message:
In train.default(list(crim = c(0.03237, 0.06905, 0.02985, 0.08829, : cannnot compute class probabilities for regression
Therefore, I thought that the error lies in the second line and I tried to convert X[train,] and Y[train] into factors, but this did not work out as I assumed.
The code is the following:
set.seed(40)
library(caret)
library(devtools)
library(caretEnsemble)
# Data
library(mlbench)
data(BostonHousing2)
X = model.matrix(cmedv~crim+zn+indus+chas+nox+rm+age+dis+ rad+tax+ptratio+b+lstat+lat+lon, BostonHousing2)[,-1]
X = data.frame(X)
Y = BostonHousing2$cmedv
train = runif(nrow(X)) <= .66
folds=5
repeats=1
# fold cross-validations are used as the resampling scheme.
myControl = trainControl(method='cv', summaryFunction=twoClassSummary, number = folds, repeats = repeats, classProbs=TRUE, savePredictions=TRUE, index=createMultiFolds(Y[train], k=folds, times=repeats))
PP = c('center', 'scale')
names(all.models) = sapply(all.models, function(x) x$method)
sort(sapply(all.models, function(x) min(x$results$RMSE)))
# regression, elastic net regression, or greedy optimization.
print(all.models)
greedy = caretEnsemble(all.models, iter=1000L)
print(greedy)
sort(greedy$weights, decreasing=TRUE)
greedy$error
Any help would be appreciated.