shouhengyi-microsoft closed this issue 7 years ago
The binary classifier metrics class we have right now is a simple one that doesn't compute anything needing multiple passes over the data. If you need more complex metrics such as AUC, I think the easiest thing to do is to feed the output from Keystone's model to the BinaryClassificationMetrics class in Spark. The code would look something like:
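    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

    val testActual = ... // Create actual labels as an RDD[Double]
    val predictor = ... andThen NaiveBayesEstimator(...)
    val predictions = predictor(testData).get // This is an RDD[Double] with a score for each example
    // BinaryClassificationMetrics expects (score, label) pairs; zip requires both RDDs
    // to have the same number of partitions and elements per partition
    val metrics = new BinaryClassificationMetrics(predictions.zip(testActual))
    val auc = metrics.areaUnderROC() // Area under the ROC curve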
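To sanity-check an AUC value on a handful of examples without a Spark context, a small local computation can help. The sketch below is not part of KeystoneML or Spark: `LocalAuc` and `auc` are hypothetical names, and it assumes labels are encoded as 1.0 (positive) and 0.0 (negative). It uses the pairwise definition of AUC, the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. It is quadratic in the sample size, so it is only meant for small test inputs.

```scala
// Hypothetical helper for small local samples; not a KeystoneML or Spark API.
object LocalAuc {
  /** AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5. */
  def auc(scoreAndLabels: Seq[(Double, Double)]): Double = {
    // Assumption: labels are 1.0 for positives and 0.0 for negatives.
    val positives = scoreAndLabels.collect { case (score, label) if label > 0.5 => score }
    val negatives = scoreAndLabels.collect { case (score, label) if label <= 0.5 => score }
    require(positives.nonEmpty && negatives.nonEmpty, "need at least one example of each class")

    // Score every (positive, negative) pair: 1.0 if ranked correctly, 0.5 on a tie, else 0.0.
    val pairScores = for {
      p <- positives
      n <- negatives
    } yield if (p > n) 1.0 else if (p == n) 0.5 else 0.0

    pairScores.sum / (positives.size.toLong * negatives.size)
  }

  def main(args: Array[String]): Unit = {
    // (score, label) pairs, in the same order BinaryClassificationMetrics takes them.
    val sample = Seq((0.9, 1.0), (0.8, 0.0), (0.7, 1.0), (0.3, 0.0))
    println(auc(sample)) // 3 of the 4 positive/negative pairs are ranked correctly: 0.75
  }
}
```

Collecting a small slice of the zipped (score, label) RDD and passing it to `auc` gives a quick reference value to compare against.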
Hi all,
I've been reading the BinaryClassificationMetrics docs [http://keystone-ml.org/api/latest/#evaluation.BinaryClassificationMetrics], but AUC is missing. What is the best way to calculate AUC for binary classification problems?
Thanks.