szilard / benchm-ml

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.).
MIT License

Question on the metric of AUC #56

Open kyhhdm opened 5 years ago

kyhhdm commented 5 years ago

I find it a little confusing that the evaluation of the classification tasks uses the predicted probabilities directly when calculating the AUC.
For example, in 6-xgboost.R#L39, would it be better to compute it with (phat>0.5) instead?

szilard commented 5 years ago

To calculate AUC you need the probability predictions.
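
A minimal sketch of why this is the case, in plain Python rather than the R code of the benchmark. AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, so it depends on how the scores rank the examples. Thresholding at 0.5 collapses the scores to two values and discards most of that ranking. The `auc` helper and the toy data below are hypothetical, written only for illustration, and are not taken from the repository.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This is an O(n_pos * n_neg) illustration, not an efficient implementation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): true labels and predicted probabilities.
y = [0, 0, 1, 1, 0, 1]
phat = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]

# AUC from the probabilities uses the full ranking of the examples.
auc_prob = auc(y, phat)

# AUC from hard 0/1 predictions only sees two distinct score values.
auc_hard = auc(y, [1 if p > 0.5 else 0 for p in phat])

print(auc_prob, auc_hard)
```

On this toy data the probability-based AUC is 7/9 while the thresholded version is 6/9. The second number describes a single operating point (the 0.5 cutoff), whereas the first summarizes the classifier across all possible cutoffs.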