Closed: drcege closed this issue 5 years ago
Hello @drcege, thanks for asking. We followed this paper: https://arxiv.org/abs/1511.06233 for the F-measure calculation. It was modified from the original equation for better open-set evaluation.
@zhmiao @drcege I am also confused by this issue. I suggest that the authors (@zhmiao) explain it clearly rather than redirecting readers to other literature, since this criterion is so important for this manuscript.
@zhmiao I looked for related information in https://arxiv.org/abs/1511.06233, but I found nothing that explains it. Could you please explain this issue?
@drcege Hi, have you come to understand this issue? I am as confused as you were. Could you share your latest understanding?
Searching for "open-set f-measure" I just found this thread. Here are my thoughts on this issue. Many works on open-set recognition use the f-measure, but the authors do not specify how the metric is calculated. Sometimes it seems to me that the conclusions presented in some works on open-set recognition are unreliable because of the metric employed. How can we be sure that the employed metric actually measures better open-set behavior? It also seems that some authors do not check whether the metric really captures better behavior of the classifier. I do not know the details of the work associated with this repository; still, I agree with the concerns raised by @drcege. If any of you are interested in evaluation metrics proposed specifically for open-set recognition, take a look at a paper in which we discuss this issue [1]. There, in Section 4.1, we propose the open-set f-measure and the normalized accuracy, both intended for open-set problems.
[1] Mendes Júnior, P. R., de Souza, R. M., Werneck, R. de O., Stein, B. V., Pazinato, D. V., de Almeida, W. R., Penatti, O. A. B., Torres, R. da S., and Rocha, A. de R. (2017). "Nearest Neighbors Distance Ratio Open-Set Classifier". Machine Learning, 106, 359–386. https://doi.org/10.1007/s10994-016-5610-8
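To make the discussion concrete, here is a minimal sketch of one way to read the open-set f-measure from Section 4.1 of [1]: precision and recall are computed over the known classes only, a correctly rejected unknown sample is not counted as a true positive, and mistakes involving unknown samples still count as false positives or false negatives. This is an illustration under stated assumptions, not the code of this repository or of the paper. The function name `open_set_f_measure`, the use of `-1` as the unknown label, and the choice of micro-averaging are all assumptions made for the example.

```python
from typing import Sequence

UNKNOWN = -1  # assumed label for the open-set ("unknown") class


def open_set_f_measure(
    preds: Sequence[int], labels: Sequence[int], unknown: int = UNKNOWN
) -> float:
    """Micro-averaged F-measure over known classes only (hypothetical helper)."""
    tp = fp = fn = 0
    for p, y in zip(preds, labels):
        if p == y:
            # A correct prediction counts only if the class is known;
            # correctly rejected unknowns are deliberately ignored.
            if y != unknown:
                tp += 1
            continue
        if p != unknown:
            # Known class p wrongly received a sample (known or unknown).
            fp += 1
        if y != unknown:
            # Known class y lost a sample (to another class or to "unknown").
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Pairs of (prediction, label): two correct known, one correct rejection,
# one unknown accepted as class 0, one known sample rejected as unknown.
preds = [0, 1, -1, 0, -1]
labels = [0, 1, -1, -1, 1]
score = open_set_f_measure(preds, labels)
```

With these five pairs the counts are TP = 2, FP = 1, FN = 1, so precision and recall are both 2/3 and the score is 2/3. Note that a known sample confused with another known class adds one false positive and one false negative under this reading.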
Hi, I wonder whether true positives, false positives, and false negatives are counted correctly. https://github.com/zhmiao/OpenLongTailRecognition-OLTR/blob/4a1f4009921b1c99029bfda151915058ff086a51/utils.py#L86-L89 Here are some examples according to the above code: (pairs of prediction and label)
I'm confused about