Closed randomjohn closed 9 years ago
Eliminating one of the classes of the factor still keeps the same levels:
> levels(iris2$Species)
[1] "setosa" "versicolor" "virginica"
and ROC curves require a factor with two levels. Reset them via:
> iris2$Species <- factor(as.character(iris2$Species))
> levels(iris2$Species)
[1] "setosa" "virginica"
and mod1 works fine. The warning "Can't have empty classes in y." doesn't do a good job of helping you figure that out.
Max
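A minimal sketch of the behaviour described above, assuming iris2 was built by filtering the versicolor rows out of the built-in iris data (the original post does not show how iris2 was created). It uses base R's droplevels(), which should be equivalent here to the factor(as.character(...)) idiom:

```r
# Assumption: iris2 is iris with one species filtered out.
iris2 <- iris[iris$Species != "versicolor", ]

# Subsetting rows keeps all three levels; one of them now has zero rows.
before <- levels(iris2$Species)
empty_before <- names(which(table(iris2$Species) == 0))

# Drop unused levels so the outcome is a genuine two-level factor.
iris2$Species <- droplevels(iris2$Species)
after <- levels(iris2$Species)

print(before)
print(empty_before)
print(after)
```

Checking table() on the outcome before calling train() is a quick way to spot an empty class.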
I'm not sure that this is the appropriate issue to mention this, since it seems that the problem above is from improperly specifying factors; but I'm getting the same error message:
Full details here http://stackoverflow.com/questions/33088893/caret-random-forests-not-working-something-is-wrong-all-the-accuracy-metric
I've added a check in the new version to make sure that the outcome is a factor with non-zero frequencies. You can relevel the outcome prior to fitting the model.
For predictors, I'm in the process of adding options for removing zero- and near-zero-variance predictors to preProcess.
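Until such checks are built in, something like the following base R sketch can flag both conditions by hand. The data frame dat and its columns are hypothetical and only for illustration; caret's nearZeroVar() covers the near-zero-variance case, which this sketch does not attempt:

```r
# Hypothetical toy data: outcome 'y' has an unused level, 'x2' is constant.
dat <- data.frame(
  x1 = c(1.2, 3.4, 2.2, 5.1),
  x2 = c(7, 7, 7, 7),
  y  = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))
)

# Outcome levels that never occur in the data.
empty_levels <- names(which(table(dat$y) == 0))

# Predictors with a single unique value, i.e. zero variance.
predictors <- dat[, setdiff(names(dat), "y"), drop = FALSE]
is_constant <- vapply(predictors, function(col) length(unique(col)) < 2, logical(1))
constant_cols <- names(predictors)[is_constant]

print(empty_levels)
print(constant_cols)
```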
As I mentioned, I'm not sure this is the appropriate issue to comment in, since my issue isn't really the same as the above--it's just the same error message.
@topepo My outcome has no levels with zero frequencies and none of my predictors have zero or near-zero variances. It's possible that this is the case in the test data that I have on SO; I'll double-check and update if so.
Updated: So, in my test data, variable "x7" does have a very small variance. But if I remove this variable (test$x7 <- NULL), I still get the same errors.
I tested alex's problem. The error is related to missing values in the resampled performance measures. Interestingly, if you change the method from "cforest" or "parRF" to "rf", it works in parallel.
I also tested "cforest" without running it in parallel, and then it works. It looks like the parallel option causes some conflict with the building of the resampled performance measures when using the methods "cforest" or "parRF".
For Alex's problem, here is the answer that I posted on SO:
When I run the first cforest model, I can see that "In addition: There were 31 warnings (use warnings() to see them)". These say that
unused arguments (verbose = FALSE, proximity = FALSE, importance = TRUE)
These are arguments to the randomForest function and not cforest. Removing them removes the errors.
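The mechanism can be reproduced in plain R without either package. In this sketch, fit_with_importance and fit_without are hypothetical stand-ins for two model functions with different signatures; they are not part of caret, randomForest, or party:

```r
# Hypothetical stand-ins: only the first function accepts 'importance'.
fit_with_importance <- function(x, importance = FALSE) length(x)
fit_without <- function(x) length(x)

# Passing the argument to a function that accepts it works.
ok <- fit_with_importance(1:3, importance = TRUE)

# Passing it to a function that does not accept it fails with
# an "unused argument" error, captured here as a message string.
msg <- tryCatch(
  fit_without(1:3, importance = TRUE),
  error = function(e) conditionMessage(e)
)

print(ok)
print(msg)
```

Inside train() such failures are reported as warnings for each resample, which is presumably why the final error only mentions missing metric values.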
I have updated the SO post. I'm still getting errors
I received the same error message when running gbm in caret. I finally corrected the problem by removing the allowParallel=TRUE argument from the train() function.
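For reference, allowParallel is an argument of trainControl() in caret, not of train(); extra arguments given to train() are passed on to the underlying model function. A minimal sketch of turning parallel processing off in the control object, where the resampling settings are placeholders rather than anything from the comment above:

```r
library(caret)

# allowParallel belongs to trainControl(); method and number are
# illustrative values, not taken from the original report.
ctrl_seq <- trainControl(method = "cv", number = 5, allowParallel = FALSE)

print(ctrl_seq$allowParallel)
```

The resulting object would then be supplied as trControl = ctrl_seq in the call to train().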
Hello, I am getting the error below while working locally on my laptop: "Error in train.default(x, y, weights = w, ...) : The tuning parameter grid should have columns trials, model, winnow". I had caret version 6.0.72 and it didn't work. I deleted it and installed version 6.0.35, but I still get the same error. Any thoughts, please?
I got the error message when running the command below:
set.seed(300)
m <- train(as.factor(default) ~ ., data = credit_resample, method = "C5.0", metric = "kappa", trControl = ctrl, tuneGrid = grid)
Thanks
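The post does not show how grid was defined, so the following is only a sketch of a grid whose column names match the ones listed in the error message. The candidate values are illustrative assumptions, not a recommendation:

```r
# Column names taken from the error message: trials, model, winnow.
# The values themselves are placeholders chosen for illustration.
grid <- expand.grid(
  trials = c(1, 10, 20),
  model  = c("tree", "rules"),
  winnow = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Confirm that none of the required columns is missing.
missing_cols <- setdiff(c("trials", "model", "winnow"), names(grid))

print(names(grid))
print(nrow(grid))
print(missing_cols)
```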
@ptagne we can't really do much without a small reproducible example and the results of sessionInfo.
I downloaded caret version 6.0.35 and replaced caret version 6.0.73 and it worked. Thanks, Pascal.
How to downgrade caret version
@pverspeelt's tip worked fine for me.
I had a similar problem. In my case I was using rfeControl(functions=caretFunctions) for a binary logistic regression. It was resolved when I changed that to rfeControl(functions=lrFuncs).
I am working through your Applied Predictive Modeling text. I am on problem 12.3.c and I am getting this error for ROC or accuracy on the churn data. However, I am using library(modeldata) and data("mlc_churn"), as CS50 is no more.
I did the following, and I am suspecting that there might be some linear combinations going on, but I am really new to that concept, and the trim function did not discard any values for the numerical predictors
I've tried changing the integer values by cutting them into various levels, and I am doing my best not to have any zero-frequency classes. This data is really imbalanced, and that posed a challenge. I was having the issue when just leaving the integers as integers as well.
These are the splits I had for converting them to factors
Using LRA with glm or multinomial gets around 83% accuracy and an ROC of 0.85.
I'm using
ctrl <- trainControl(method = "LGOCV", summaryFunction = twoClassSummary, classProbs = TRUE, savePredictions = TRUE)
and split the data like so:
Churn <- mlc_churn$churn
training <- createDataPartition(Churn, p = .8, list = FALSE)
trainPreds <- preds[training, ]
testPreds <- preds[-training, ]
trainChurn <- Churn[training]
testChurn <- Churn[-training]
All other models in chapter 12 are throwing this error, and I can't figure out what I am doing wrong. No issues on problems 12.1 or 12.2, so I must be missing something with pre-processing
I hope that is enough detail, to be pointed in the right direction.
example of error
set.seed(1)
plsChurn <- train(x = trainPreds,
y = trainChurn,
method = "pls",
tuneGrid = expand.grid(.ncomp = 1:15),
preProc = c("center","scale"),
metric = "ROC",
probMethod = "Bayes",
trControl = ctrl)
Something is wrong; all the ROC metric values are missing:
Error: Stopping
In addition: There were 50 or more warnings (use warnings() to see the first 50)
50!
The same happens with linear discriminant analysis. When I use lda(x=trainPredsPreProcess, grouping=trainChurn), it builds the model, but when I try to predict with it, I get the following error: Error in FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...) : non-numeric argument to binary operator
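One possible cause of "non-numeric argument to binary operator" (an assumption, since the predictor data is not shown) is that the predictors still contain factor or character columns, so converting them to a matrix yields character values that arithmetic cannot handle. A quick base R check, using a hypothetical stand-in for trainPredsPreProcess:

```r
# Hypothetical stand-in for the predictor data frame in the post.
preds_demo <- data.frame(
  minutes = c(10.5, 20.1, 15.3),
  calls   = c(3L, 5L, 4L),
  plan    = factor(c("yes", "no", "yes"))
)

# Flag the columns that are not numeric.
is_num <- vapply(preds_demo, is.numeric, logical(1))
non_numeric_cols <- names(preds_demo)[!is_num]

# A data frame with a factor column becomes a character matrix.
matrix_type <- typeof(as.matrix(preds_demo))

print(non_numeric_cols)
print(matrix_type)
```

If any columns are flagged, they would need to be converted to numeric dummy variables before being passed in the x argument.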
I have done, from a clean R 3.2.0 (x64 Windows 8) installation:
Gives me:
This is about as barebones an example as I can give, but basically caret is now unusable as a classification tool because of the above.