Closed studerus closed 7 years ago
I noticed that your version of auc is still about one third slower than the one I recently implemented in mlr. One bottleneck seems to be this line:
    if(class(actual) %in% c('factor', 'character')){
      actual = as.numeric(as.factor(as.character(actual))) - 1
    }
I suggest replacing it with:
    if (inherits(actual, 'factor')) {
      actual <- as.integer(actual) - 1L
    } else if (inherits(actual, 'character')) {
      actual <- as.integer(as.factor(actual)) - 1L
    }
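To make the behaviour of the two conversions concrete, here is a minimal base R sketch that compares them on the same labels. The helper names `to_binary_old` and `to_binary_new` are hypothetical wrappers made up for this illustration; they are not part of ModelMetrics. One thing the sketch suggests is worth checking before merging: the old line re-derives the levels from the character values (so they end up sorted), while the new one keeps the factor's existing level order, so the two should only agree when the levels are already in sorted order.

```r
# Hypothetical wrapper around the conversion currently used in auc()
to_binary_old <- function(actual) {
  if (class(actual) %in% c("factor", "character")) {
    actual <- as.numeric(as.factor(as.character(actual))) - 1
  }
  actual
}

# Hypothetical wrapper around the proposed replacement
to_binary_new <- function(actual) {
  if (inherits(actual, "factor")) {
    actual <- as.integer(actual) - 1L
  } else if (inherits(actual, "character")) {
    actual <- as.integer(as.factor(actual)) - 1L
  }
  actual
}

set.seed(1)
labels <- c("Neg", "Pos")

# Factor whose levels are already in sorted order
sorted_lv <- sample(factor(labels, levels = labels), 1000, replace = TRUE)
# Same observations, but with the level order reversed
custom_lv <- factor(as.character(sorted_lv), levels = c("Pos", "Neg"))
# Same observations as a plain character vector
as_chr <- as.character(sorted_lv)

same_sorted <- all(to_binary_old(sorted_lv) == to_binary_new(sorted_lv))
same_custom <- all(to_binary_old(custom_lv) == to_binary_new(custom_lv))
same_chr    <- all(to_binary_old(as_chr) == to_binary_new(as_chr))

print(c(sorted = same_sorted, custom = same_custom, character = same_chr))
```

With sorted levels and with character input the two versions produce the same 0/1 coding; with levels given as `c("Pos", "Neg")` the coding is flipped, which would turn an AUC of `a` into `1 - a`.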
See the following benchmark:
    library(ModelMetrics)
    library(mlr)
    library(microbenchmark)
    library(data.table)  # provides frankv()

    x <- c('Pos', 'Neg')
    actual <- sample(factor(x, x), 50000, replace = TRUE)
    predicted <- runif(length(actual))

    # Unexported ModelMetrics internals
    auc3_ <- ModelMetrics:::auc3_
    binaryChecks <- ModelMetrics:::binaryChecks

    auc2 <- function(actual, predicted, ...) {
      binaryChecks(actual, 'auc')
      if (inherits(actual, 'factor')) {
        actual <- as.integer(actual) - 1L
      } else if (inherits(actual, 'character')) {
        actual <- as.integer(as.factor(actual)) - 1L
      }
      # Was length(actual > 10000), which is non-zero for any non-empty input
      if (length(actual) > 10000) {
        ranks <- frankv(predicted)
        AUC <- auc3_(actual, predicted, ranks)
      } else {
        # Not reached in this benchmark (n = 50000); `ranks` is undefined here,
        # so this assumes auc_ takes only (actual, predicted)
        AUC <- ModelMetrics:::auc_(actual, predicted)
      }
      return(AUC)
    }

    microbenchmark(
      mlr = measureAUC(predicted, actual, positive = 'Pos'),
      modelmetrics = auc(actual, predicted),
      modelmetrics.improved = auc2(actual, predicted)
    )
    Unit: milliseconds
                      expr      min       lq     mean   median       uq      max neval cld
                       mlr 4.148332 4.236479 4.454990 4.305307 4.445979 6.128918   100 a
              modelmetrics 6.395471 6.496297 6.840453 6.631536 6.790019 9.250582   100   c
     modelmetrics.improved 4.480996 4.534126 4.836400 4.623480 4.722645 6.802999   100  b
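For a quick read of these numbers, the following small sketch computes relative timings from the median column only. The values are copied from the output above; the variable names are made up for the illustration.

```r
# Median timings in milliseconds, copied from the benchmark output
medians <- c(
  mlr = 4.305307,
  modelmetrics = 6.631536,
  modelmetrics.improved = 4.623480
)

# Each median expressed as a multiple of the mlr median
relative_to_mlr <- medians / medians[["mlr"]]

# Fraction of the current modelmetrics time saved by the proposed change
saved <- 1 - medians[["modelmetrics.improved"]] / medians[["modelmetrics"]]

print(round(relative_to_mlr, 2))
print(round(saved, 2))
```

On these medians the current `auc` takes about 1.54 times as long as the mlr version, the modified one about 1.07 times as long, and the change removes roughly 30% of the current run time.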
Thank you!