Closed by remind0 2 years ago
The F1 score reported in the paper is not the regular micro F1 for 4-class classification. We calculate it the same way as the EMNLP 2019 paper by Qiang Ning et al. (see Appendix A on page 5). Here is the link to their paper: https://arxiv.org/pdf/1909.00429.pdf
I think the code that calculates precision, recall, and F1 in metric.py may be wrong. As written, precision and recall are computed the same way, except that the divisor of precision also contains CM[3][0:3].sum(). The correct code might be:
P = 1.0 * (CM[0][0] + CM[1][1] + CM[2][2]) / (CM.sum(axis=0)[0:3].sum())
R = 1.0 * (CM[0][0] + CM[1][1] + CM[2][2]) / (CM.sum(axis=1)[0:3].sum())
Is this what it should look like? Thanks.
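For anyone who wants to check the numbers, here is a minimal self-contained sketch of the calculation proposed above. It is not the code from metric.py. It assumes CM is a 4x4 NumPy confusion matrix with rows as gold labels and columns as predictions, and that class index 3 is the one left out of the score. The function name prf_excluding_last_class and the example matrix are made up for illustration.

```python
import numpy as np

def prf_excluding_last_class(cm):
    """Precision, recall and F1 over classes 0-2 of a 4x4 confusion matrix.

    Assumed layout (hypothetical, not confirmed by the repo):
    cm[i][j] = number of examples with gold label i predicted as label j,
    and class index 3 is excluded from the score.
    """
    cm = np.asarray(cm, dtype=float)
    # Correct predictions among the three scored classes.
    correct = cm[0][0] + cm[1][1] + cm[2][2]
    # Column sums: everything predicted as class 0, 1 or 2.
    predicted = cm.sum(axis=0)[0:3].sum()
    # Row sums: everything whose gold label is class 0, 1 or 2.
    gold = cm.sum(axis=1)[0:3].sum()
    p = correct / predicted if predicted > 0 else 0.0
    r = correct / gold if gold > 0 else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f1

# Made-up example matrix, for illustration only.
CM = np.array([
    [5, 1, 0, 1],
    [1, 4, 0, 0],
    [0, 0, 3, 1],
    [3, 2, 1, 4],
])

P, R, F1 = prf_excluding_last_class(CM)
print(P, R, F1)
```

On this made-up matrix, 12 predictions are correct, 20 examples are predicted as one of the first three classes, and 16 examples have one of those classes as their gold label, so precision is 12/20 and recall is 12/16. The two values differ only because the denominators differ, which is the point of the proposed fix.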