Do we have a way to calculate AUC for a binary classifier?

amplab / keystone

Simplifying robust end-to-end machine learning on Apache Spark.

Apache License 2.0

The binary classifier we have right now is a simple class that doesn't compute anything requiring multiple passes over the data. If you need more complex metrics such as AUC, I think the easiest approach is to feed the output of Keystone's model into Spark's BinaryClassificationMetrics. The code would look something like this:

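import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

val testActual = ... // Actual labels as an RDD[Double], in the same order as testData
val predictor = ... andThen NaiveBayesEstimator(...)
val predictions = predictor(testData).get // RDD[Double] with a score for each example
// zip pairs each score with its label; both RDDs must have the same
// number of partitions and the same number of elements per partition.
val metrics = new BinaryClassificationMetrics(predictions.zip(testActual))
val auc = metrics.areaUnderROC() // Area under the ROC curve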
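
To sanity-check the metrics call in isolation, here is a minimal sketch that runs BinaryClassificationMetrics on a handful of hand-written (score, label) pairs instead of Keystone output. The object name `AucSketch`, the helper `computeAuc`, the toy data, and the local master setting are all assumptions made for illustration; none of them come from Keystone.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

object AucSketch {
  // Hypothetical toy data: (score, label) pairs, where label 1.0 is the positive class.
  val scoreAndLabels: Seq[(Double, Double)] = Seq(
    (0.9, 1.0), (0.8, 1.0), (0.7, 0.0), (0.6, 1.0), (0.3, 0.0), (0.1, 0.0))

  // Builds the metrics object from (score, label) pairs and returns the area under the ROC curve.
  def computeAuc(sc: SparkContext): Double = {
    val metrics = new BinaryClassificationMetrics(sc.parallelize(scoreAndLabels))
    metrics.areaUnderROC()
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for this sketch; a real job would use the cluster's settings.
    val conf = new SparkConf().setMaster("local[2]").setAppName("auc-sketch")
    val sc = new SparkContext(conf)
    try {
      // 8 of the 9 (positive, negative) pairs are ranked correctly,
      // so this should print a value of roughly 0.889.
      println(s"AUC = ${computeAuc(sc)}")
    } finally {
      sc.stop()
    }
  }
}
```

The same metrics object also exposes `areaUnderPR()` if the area under the precision-recall curve is more useful for your data.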

Do we have a way to calculate AUC for a binary classifier? #288