VinaTsai / xgboost_notebook


Model evaluation: Accuracy, Precision, Recall & F1 Score #11

Open · VinaTsai opened this issue 3 years ago

VinaTsai commented 3 years ago

TP, TN, FP, FN

  1. True Positives (TP) - cases where the actual class is yes and the predicted class is also yes.
  2. True Negatives (TN) - cases where the actual class is no and the predicted class is also no.
  3. False Positives (FP) - cases where the actual class is no but the predicted class is yes.
  4. False Negatives (FN) - cases where the actual class is yes but the predicted class is no.
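All four counts can be read directly off a binary confusion matrix. Below is a minimal sketch using scikit-learn's confusion_matrix; the toy labels are made up for illustration and are not from this notebook:

```python
# Minimal sketch: extracting TP, TN, FP, FN with scikit-learn.
# y_true / y_pred are illustrative toy labels, not data from this repo.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes (1 = yes, 0 = no)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # predicted classes

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=3, FP=1, FN=1
```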

Accuracy, Precision, Recall and F1 score.

Accuracy - Accuracy is the most intuitive performance measure: it is simply the ratio of correctly predicted observations to the total number of observations.

  • Accuracy is a good measure, but only on symmetric datasets where the numbers of false positives and false negatives are roughly the same.

Accuracy = (TP+TN)/(TP+FP+FN+TN)

Precision - Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.

  • High precision relates to a low false positive rate. Precision = TP/(TP+FP)

Recall (Sensitivity) - Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class. A recall above 0.5 is generally considered good.

  • High recall relates to a low false negative rate. Recall = TP/(TP+FN)

F1 score - F1 Score is the harmonic mean of Precision and Recall, so it takes both false positives and false negatives into account.

  • Intuitively it is not as easy to understand as accuracy, but F1 is usually more useful, especially when the class distribution is uneven. Accuracy works best when false positives and false negatives have similar costs; when those costs differ substantially, it is better to look at both Precision and Recall.

F1 Score = 2 * (Recall * Precision) / (Recall + Precision)
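As a sanity check on the four formulas above, here is a small sketch (an assumption for illustration, not from the original issue) that computes each metric by hand from TP/TN/FP/FN and compares the results with scikit-learn's built-in scorers on the same toy labels:

```python
# Sketch: the four metrics computed by hand from TP/TN/FP/FN, cross-checked
# against scikit-learn. The toy labels are illustrative only.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * recall * precision / (recall + precision)

print(accuracy, accuracy_score(y_true, y_pred))    # 0.75 0.75
print(precision, precision_score(y_true, y_pred))  # 0.75 0.75
print(recall, recall_score(y_true, y_pred))        # 0.75 0.75
print(f1, f1_score(y_true, y_pred))                # 0.75 0.75
```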

Conclusion:

Data characteristic    Recommended metric
Balanced data          Accuracy
Imbalanced data        F1 score

References:

https://blog.exsilio.com/all/accuracy-precision-recall-f1-score-interpretation-of-performance-measures/

VinaTsai commented 3 years ago

Precision-Recall Curve

The precision-recall curve shows the tradeoff between precision and recall for different thresholds. A high area under the curve represents both high recall and high precision, where high precision relates to a low false positive rate, and high recall relates to a low false negative rate. High scores for both show that the classifier is returning accurate results (high precision), as well as returning a majority of all positive results (high recall).

  • Precision-Recall is a useful measure of success of prediction when the classes are very imbalanced.

A system with high recall but low precision returns many results, but most of its predicted labels are incorrect when compared to the training labels. A system with high precision but low recall is just the opposite, returning very few results, but most of its predicted labels are correct when compared to the training labels. An ideal system with high precision and high recall will return many results, with all results labeled correctly.

https://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html
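A compact way to reproduce the core steps of the linked scikit-learn example; the synthetic imbalanced dataset and the logistic-regression model below are assumptions chosen for illustration:

```python
# Sketch: computing a precision-recall curve with scikit-learn on a
# synthetic imbalanced dataset; the model and data choices are assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

# One (precision, recall) point per candidate decision threshold.
precision, recall, thresholds = precision_recall_curve(y_test, scores)
print("area under the PR curve:", auc(recall, precision))
```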

VinaTsai commented 3 years ago

Examples

https://blog.argcv.com/articles/1036.c

VinaTsai commented 3 years ago

Conclusion

  1. Precision is the accuracy among samples predicted as positive (denominator: predicted positives); both its numerator and its denominator depend on the threshold (see the sketch after this list).
  2. Recall is the proportion of actual positive samples that are correctly predicted (denominator: actual positives); its numerator depends on the threshold, but its denominator does not.
  3. Precision and recall are correlated to some extent, but the relationship is not monotonic.
  4. High precision & low recall: a high proportion of the predicted positives are correct, but few of the actual positives are captured.
  5. High recall & low precision: many of the actual positives are correctly predicted, but even more of the samples predicted as positive are actually negative.
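A minimal sketch of points 1-3: sweeping the decision threshold trades recall for precision, while the count of actual positives in recall's denominator stays fixed. The scores, labels, and thresholds below are toy values chosen for illustration:

```python
# Sketch: raising the decision threshold shrinks the set of predicted
# positives, which tends to raise precision and lower recall.
# The scores and labels are toy values chosen for illustration.
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])  # 4 actual positives (fixed)
scores = np.array([0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05])

for threshold in (0.3, 0.5, 0.7):
    y_pred = (scores >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
# threshold=0.3: precision=0.57 recall=1.00
# threshold=0.5: precision=0.60 recall=0.75
# threshold=0.7: precision=0.67 recall=0.50
```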