This line is getting a single area under the ROC curve pooling across resamples (2304 total rows of data).
ROC = plot.roc(rfe_rf$pred$obs[selectedIndices],
rfe_rf$pred$neg[selectedIndices], legacy.axes = TRUE)
This line is taking the average of the 30 resampled area under the ROC curve estimates.
mean(rfe_rf$resample$ROC[which(rfe_rf$resample$Variables == 8)])
They should be "close" but would only be the same in rare circumstances (e.g. balanced data over folds and a linear performance metric).
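To make the difference concrete, here is a minimal sketch on simulated data, not the rfe_rf object from this issue. The helper auc_rank is a hypothetical, hand-rolled rank-based (Mann-Whitney) AUC, not caret's or pROC's internal code, and the data frame preds with its columns Resample, obs and pos is made up to mimic the layout of held-out predictions. The per-resample score shift is an assumption added so that scores from different resamples are not on the same scale, which is one way pooling and averaging can drift apart.

```r
# Hypothetical helper: rank-based (Mann-Whitney) AUC using base R only.
auc_rank <- function(obs, score, positive) {
  is_pos <- obs == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # average ranks are used for ties
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)

# Simulated held-out predictions from 3 resamples (made-up data, 30 neg / 10 pos each).
preds <- data.frame(
  Resample = rep(c("Fold1", "Fold2", "Fold3"), each = 40),
  obs = rep(c("neg", "neg", "neg", "pos"), times = 30),
  stringsAsFactors = FALSE
)

# Assumed per-resample shift: each resample's model scores on a slightly different scale.
shift <- c(Fold1 = 0, Fold2 = 0.5, Fold3 = -0.5)
preds$pos <- rnorm(nrow(preds), mean = ifelse(preds$obs == "pos", 1, 0)) +
  unname(shift[preds$Resample])

# (1) One AUC computed on all held-out rows pooled together.
pooled_auc <- auc_rank(preds$obs, preds$pos, "pos")

# (2) One AUC per resample, then the average of those values.
per_fold <- sapply(split(preds, preds$Resample),
                   function(d) auc_rank(d$obs, d$pos, "pos"))
mean_auc <- mean(per_fold)

print(c(pooled = pooled_auc, averaged = mean_auc))
```

The two printed numbers will generally be close but not identical. The first corresponds to the pooled plot.roc call above and the second to the mean over rfe_rf$resample$ROC.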
Hi, I want to generate ROC curves using the training data and resample results from the rfe function for the optimal subset size. I have managed to do this with the code below, but there is an inconsistency between the mean ROC value that caret calculates and the one that I calculate with the pROC package, i.e. caret ROC = 0.8307 and pROC ROC = 0.8287.
I can't figure out why this is happening. Is this a bug in my code, or do the packages calculate ROC in different ways?
On my own dataset (not shown here), the difference is bigger i.e. caret-ROC 92.2%, pROC-ROC 89.15%.
Thanks a lot in advance! Matina
Minimal, runnable code:
I can reproduce the ROC given by caret by averaging the ROC values from each resample for the optimal subset.