Closed ypriverol closed 8 years ago
packageVersion('caret')
remove.packages(c('caret'))
library(devtools)
install_version(package='caret',version='6.0-52')
run again and report ;)
@cafernandezlo I fixed the error by using:
mod <- getModelInfo("svmRadial", regex = FALSE)[[1]]
mod$predict <- function(modelFit, newdata, submodels = NULL) {
  svmPred <- function(obj, x) {
    hasPM <- !is.null(unlist(obj@prob.model))
    if(hasPM) {
      pred <- lev(obj)[apply(predict(obj, x, type = "probabilities"), 1, which.max)]
    } else pred <- predict(obj, x)
    pred
  }
  out <- try(svmPred(modelFit, newdata), silent = TRUE)
  if(is.character(lev(modelFit))) {
    if(class(out)[1] == "try-error") {
      warning("kernlab class prediction calculations failed; returning NAs")
      out <- rep("", nrow(newdata))
      out[seq(along = out)] <- NA
    }
  } else {
    if(class(out)[1] == "try-error") {
      warning("kernlab prediction calculations failed; returning NAs")
      out <- rep(NA, nrow(newdata))
    }
  }
  if(is.matrix(out)) out <- out[,1]
  out
}
#Support Vector Machine Object
svmProfileValue <- rfe(trainDescr, trainClass, sizes = (1:4),
                       rfeControl = rfeControl(functions = caretFuncs, number = numberIter, verbose = TRUE),
                       method = mod);
But I can try your solution.
@ypriverol it is not a real solution, it's simply a downgrade to avoid the problem. We found a similar problem with our RRegrs package https://github.com/enanomapper/RRegrs and the latest version of the caret package.
Thanks @cafernandezlo I will try.
The solution from @ypriverol was recommended by me.
In the previous version of caret, extractPrediction took care of it, and in newer versions we avoid that function in predict.train. Interestingly, extractPrediction was not designed to convert the matrix down to a vector. That change in output type appears to have happened in kernlab after the original version of the function. Fortunately, the code
pred <- c(pred, tempUnkPred)
unknowingly solved the issue. Something similar occurred with the earth package last year.
I think that the best solution is to modify the model objects to return a vector (since that is what we want). Eventually, train will be modified to return vector-valued predictions, and the change to predictionFunction that fixes this bug will be undone.
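To illustrate the matrix-versus-vector point, here is a minimal base R sketch. The values are made up and the one-column matrix only stands in for what a kernlab regression prediction might look like; it is not output from the models in this thread.

pred <- matrix(c(6.1, 5.4, 7.2), ncol = 1)  # hypothetical one-column prediction matrix
is.matrix(pred)                             # TRUE
# c() drops the dim attribute, which is how a line like
# pred <- c(pred, tempUnkPred) can flatten the result as a side effect
flattened <- c(pred, numeric(0))
is.matrix(flattened)                        # FALSE
# the explicit conversion used in the custom predict function above
vec <- if (is.matrix(pred)) pred[, 1] else pred
is.matrix(vec)                              # FALSE

Either route ends with a plain numeric vector; the explicit check just makes the intent visible instead of relying on a side effect.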
@topepo @cafernandezlo @enriquea Hi guys, I have been facing a problem: when I finish the training and get my final model using my current hack in caret, my predict function always returns one value. See the following code:
newData <- data.frame(bjell=4, calibrated=4.9, expasy=4.5)
svmModel <- svmModel
predict(svmModel, newdata=newData)
Well, that is what I would expect:
> newData<- data.frame(bjell=4, calibrated=4.9, expasy=4.5)
> nrow(newData)
[1] 1
I don't really understand.
@topepo sorry, I didn't express myself properly. If I change:
newData <- data.frame(bjell=4, calibrated=4.9, expasy=4.5)
svmModel <- svmModel
predict(svmModel, newdata=newData)
to
newData <- data.frame(bjell=4, calibrated=10, expasy=4.5)
svmModel <- svmModel
predict(svmModel, newdata=newData)
it gives me the same value even when the variables change.
Hi all. Running this, I expected a vector with different values in pIs, but I get the same value repeated. Any ideas?
dframe <- data.frame(calibrated, bjell, expasy)
dframe
  calibrated bjell expasy
1        4.5   5.4    6.8
2        4.9   5.6    6.0
3        5.1   5.9    7.1
pIs <- predict(object = svmModel, newdata=dframe)
pIs
[1] 6.417835 6.417835 6.417835
I am using the "predict" function from Kernlab package.
I think that there might be things happening that you are not showing (like how svmModel was created). That's why we always want a small, reproducible example.
It is rare that you should generate a model using train (or rfe or others) and use the original predict code. train does things that the underlying model object may not know about (e.g. pre-processing). You should not expect to get the same/right answer by doing so.
Here is an example of training the SVM model:
load("C:/Users/Enrique/Git/pIR-master/data/svmPeptideData.rda")
peptides_properties <- subset(data, select=c("bjell", "expasy", "calibrated","aaindex"))
peptides_experimental <- subset(data, select=c("pIExp"))
svmModel <- svmProfile(dfExp = peptides_experimental, dfProp = peptides_properties, method = method, numberIter = numberIter)
The svmProfile function looks like this:
svmProfile <- function(dfExp, dfProp, method = "svmRadial", numberIter = 2){
  #load Data
  # This is the data file with the descriptors:
  peptides_desc <- as.matrix(dfProp);
  # This is the Data File with the Experimental Isoelectric Point
  peptides_class <- as.matrix(dfExp);
  #Scale and center data
  peptides_desc <- scale(peptides_desc,center=TRUE,scale=TRUE);
  #Divide the dataset in train and test sets
  # Create an index of the number to train
  inTrain <- createDataPartition(peptides_class, p = 3/4, list = FALSE)[,1];
  #Create the Training Dataset for Descriptors
  trainDescr <- peptides_desc[inTrain,];
  # Create the Testing dataset for Descriptors
  testDescr <- peptides_desc[-inTrain,];
  trainClass <- peptides_class[inTrain];
  testClass <- peptides_class[-inTrain];
  mod <- getModelInfo("svmRadial", regex = FALSE)[[1]]
  mod$predict <- function(modelFit, newdata, submodels = NULL) {
    svmPred <- function(obj, x) {
      hasPM <- !is.null(unlist(obj@prob.model))
      if(hasPM) {
        pred <- lev(obj)[apply(predict(obj, x, type = "probabilities"), 1, which.max)]
      } else pred <- predict(obj, x)
      pred
    }
    out <- try(svmPred(modelFit, newdata), silent = TRUE)
    if(is.character(lev(modelFit))) {
      if(class(out)[1] == "try-error") {
        warning("kernlab class prediction calculations failed; returning NAs")
        out <- rep("", nrow(newdata))
        out[seq(along = out)] <- NA
      }
    } else {
      if(class(out)[1] == "try-error") {
        warning("kernlab prediction calculations failed; returning NAs")
        out <- rep(NA, nrow(newdata))
      }
    }
    if(is.matrix(out)) out <- out[,1]
    out
  }
  #Support Vector Machine Object
  svmProfileValue <- rfe(trainDescr, trainClass, sizes = (1:4),
                         rfeControl = rfeControl(functions = caretFuncs, number = numberIter, verbose = TRUE),
                         method = mod);
  return (svmProfileValue)
}
svmModel looks like it should be of class train. Why did you say that you were using the kernlab predict function?
Also, I don't know anything about your data. If the predictors are on different metrics, you really should be centering and scaling. Otherwise the predictor with the largest values will dominate the dot product and this could very well be why you are getting the same value predicted.
To predict new values using a new dataset I am using the following code:
svmModel <- svmProfile()
pIs <- predict(object = svmModel, newdata=dframe)
Is that correct?
@topepo it looks like the problem may be related to the scaling. @enriquea and I will test that. BTW, if we use the same scale function from caret to scale the new values, what will happen?
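As a rough illustration of the scaling question, the base R sketch below reuses the centers and standard deviations estimated on a training set when scaling new rows. The column names are borrowed from the thread, but the numbers are made up and train_x / new_x are hypothetical names.

# Hypothetical training descriptors (values are illustrative only)
train_x <- data.frame(
  bjell      = c(5.4, 5.6, 5.9, 4.8),
  expasy     = c(6.8, 6.0, 7.1, 5.5),
  calibrated = c(4.5, 4.9, 5.1, 4.2)
)
scaled_train <- scale(train_x, center = TRUE, scale = TRUE)
# scale() stores the parameters it used as attributes
ctr <- attr(scaled_train, "scaled:center")
scl <- attr(scaled_train, "scaled:scale")
# New data must be transformed with the TRAINING parameters,
# not with a mean and sd computed from the new rows themselves
new_x <- data.frame(bjell = 4, expasy = 4.5, calibrated = 10)
scaled_new <- scale(new_x[, names(ctr), drop = FALSE], center = ctr, scale = scl)
scaled_new

Calling scale() on the new rows alone would center and scale them by their own statistics, so the model would see inputs on a different footing from the ones it was trained on.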
So if you use the preProc argument to train, it will:
predict method is used on a train object. Does that address your question?
OK. Can you point to any example where preProc is used?
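A minimal sketch of the preProc argument to train might look like the following. It calls train directly rather than going through rfe as in this thread, the toy data and the 3-fold resampling are assumptions made only so the snippet is self-contained, and it needs both caret and kernlab installed.

library(caret)  # method = "svmRadial" also requires the kernlab package

set.seed(1)
# Made-up data; only the column names come from the thread
n <- 60
toy <- data.frame(
  bjell      = runif(n, 3, 10),
  expasy     = runif(n, 3, 10),
  calibrated = runif(n, 3, 10)
)
toy$pIExp <- rowMeans(toy) + rnorm(n, sd = 0.2)

fit <- train(
  x = toy[, c("bjell", "expasy", "calibrated")],
  y = toy$pIExp,
  method = "svmRadial",
  preProc = c("center", "scale"),   # estimated from the training data
  trControl = trainControl(method = "cv", number = 3)
)

# predict() on the train object applies the stored centering and scaling
# to newdata before calling the underlying model
newData <- data.frame(bjell = c(4, 4), expasy = c(4.5, 4.5), calibrated = c(4.9, 10))
preds <- predict(fit, newdata = newData)
preds

With this setup the new data is passed in on its original scale; no manual call to scale() is needed before predict.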
Hi @ypriverol and @topepo:
The problem was fixed by applying the same transformation to the new data as to the training data set, using the preProcess function. Thank you very much for your collaboration and time.
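The exact code behind this fix is not shown in the thread. A sketch of the general pattern with caret's preProcess, assuming made-up data and the hypothetical names train_x, new_x and pp, could be:

library(caret)

# Hypothetical training descriptors (values are illustrative only)
train_x <- data.frame(
  bjell      = c(5.4, 5.6, 5.9, 4.8),
  expasy     = c(6.8, 6.0, 7.1, 5.5),
  calibrated = c(4.5, 4.9, 5.1, 4.2)
)
# Estimate centering and scaling parameters on the training data only
pp <- preProcess(train_x, method = c("center", "scale"))
train_scaled <- predict(pp, train_x)

# Apply the same stored transformation to new data
new_x <- data.frame(bjell = 4, expasy = 4.5, calibrated = 10)
new_scaled <- predict(pp, new_x)
new_scaled

The scaled new data can then be passed to predict on a model that was fitted on train_scaled.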
@enriquea thanks I will close the issue. +1
@topepo
svmProfile <- rfe(x, logBBB, sizes = c(2, 5, 10, 20), rfeControl = rfeControl(functions = caretFuncs, number = 3, verbose = TRUE), method = "svmRadial")
and it fails with the same error:
Error in { : task 1 failed - "undefined columns selected"