mlr-org / mlr

Machine Learning in R
https://mlr.mlr-org.com

ROC analysis for a regressor built with FeatSelWrapper #1946

Closed vrgaliano closed 7 years ago

vrgaliano commented 7 years ago

I am trying to apply a ROC analysis from a learner of class FeatSelWrapper:

    rdesc = makeResampleDesc("Bootstrap", iters = 100, predict = "both")
    b632plus.mse = setAggregation(mse, b632plus)
    lrn.rf = makeLearner("regr.randomForest", ntree = 2500)
    ctrl = makeFeatSelControlSequential(method = "sfs", alpha = 0.001)
    lrn = makeFeatSelWrapper(lrn.rf, resampling = rdesc, measures = b632plus.mse, control = ctrl, show.info = TRUE)
    rpart.sfs = train(lrn, task = regr.task)
    df = generateThreshVsPerfData(rpart.sfs, measures = list(fpr, tpr, mmce))

larskotthoff commented 7 years ago

What is your question?

vrgaliano commented 7 years ago

I get the following error message:

    Error in UseMethod("generateThreshVsPerfData") :
      no applicable method for 'generateThreshVsPerfData' applied to an object of class "c('FeatSelModel', 'BaseWrapperModel', 'WrappedModel')"

How can I perform a ROC analysis with a learner obtained from a feature selection wrapper?

larskotthoff commented 7 years ago

You can plot ROC curves from the predictions of the model. See the tutorial: https://mlr-org.github.io/mlr-tutorial/devel/html/roc_analysis/index.html

vrgaliano commented 7 years ago

Dear Lars, thanks for your reply. It seems that it is not supported for regression:

    rdesc = makeResampleDesc("CV", iters = 10)
    pred_rf.sbs = predict(rf.sbs, task = regr.task, resampling = rdesc)
    df = generateThreshVsPerfData(pred_rf.sbs, measures = list(fpr, tpr, mmce))

    Error in checkPrediction(obj, task.type = "classif", binary = TRUE, predict.type = "prob") :
      Prediction must be one of 'classif', but is: 'regr'

berndbischl commented 7 years ago

Prediction must be one of 'classif', but is: 'regr'

Well, of course you cannot do ROC analysis for regression. That is a technique for binary classification.

vrgaliano commented 7 years ago

Well, it is not exactly regression either. I have a variable (nitrate concentration) and I am interested in the instances that exceed a certain value (37.5). I recoded the variable as 0 (below 37.5) and 1 (above 37.5). This way I am predicting values in the range 0 to 1, which is in some sense a probability. Perhaps this approach is not correct with the mlr library, but I did something similar with the e1071 library. Any suggestions on how to improve my analysis in mlr? Thanks

Victor

larskotthoff commented 7 years ago

What are you actually investigating?

vrgaliano commented 7 years ago

The goal is twofold: to obtain maps of the probability of nitrate contamination, and to identify the drivers of this contamination via a feature selection wrapper approach. I am thinking of modifying my analysis to perform binary classification instead of regression on a binary variable. My only concern is whether I can predict both the hard class and the probability. I would like to run the feature selection wrapper in classification mode, using the AUC as the metric to optimize. I guess it makes sense, but I am open to suggestions...

larskotthoff commented 7 years ago

You can predict probabilities in mlr for classification. It sounds like this is what you should do.
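
For readers landing here, a minimal sketch of that classification setup might look like the following. It is an illustration under stated assumptions, not the original poster's analysis: the data frame, its column names, and the train/test split are made up, since the real data set is not shown in the thread. Only the 37.5 threshold and the sequential forward selection settings come from the posts above. It assumes the randomForest package is installed.

```r
library(mlr)

# Hypothetical data: a synthetic stand-in for the nitrate data set,
# which is not shown in the thread. Column names are illustrative only.
set.seed(1)
n = 200
dat = data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
nitrate = 37.5 + 10 * dat$x1 + rnorm(n, sd = 5)
# Binary target: is the concentration above the 37.5 threshold?
dat$contaminated = factor(nitrate > 37.5, levels = c(FALSE, TRUE), labels = c("no", "yes"))

task = makeClassifTask(data = dat, target = "contaminated", positive = "yes")

# predict.type = "prob" returns class probabilities along with the hard class
lrn.rf = makeLearner("classif.randomForest", predict.type = "prob", ntree = 100)

# Sequential forward selection, optimizing cross-validated AUC
rdesc = makeResampleDesc("CV", iters = 3, stratify = TRUE)
ctrl = makeFeatSelControlSequential(method = "sfs", alpha = 0.001)
lrn = makeFeatSelWrapper(lrn.rf, resampling = rdesc, measures = auc, control = ctrl, show.info = FALSE)

# Arbitrary split so the ROC curve is computed on data not used for training
train.idx = seq(1, n, by = 2)
test.idx = seq(2, n, by = 2)
mod = train(lrn, task, subset = train.idx)
pred = predict(mod, task = task, subset = test.idx)

getFeatSelResult(mod)$x            # names of the selected features
head(as.data.frame(pred))          # includes prob.no, prob.yes and response
performance(pred, measures = auc)  # AUC on the held-out half

# ROC analysis takes the Prediction object, not the trained model
df = generateThreshVsPerfData(pred, measures = list(fpr, tpr, mmce))
plotROCCurves(df)
```

The key difference from the code earlier in the thread is that `generateThreshVsPerfData` is given a classification prediction with probabilities, which is what the two error messages were asking for. The `prob.yes` column is the per-instance probability that could be used for mapping.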

berndbischl commented 7 years ago

I think we have answered this here.