Open XiongZhongxia opened 2 years ago
Actually, I also implemented a lightweight model (with roughly 10% of the FLOPs of your baseline) that takes multispectral inputs. It only reaches 62.7 AP but achieves 4.3 MR. Is this reasonable?
A very interesting and valuable question. The following is my superficial understanding of this issue; there may be mistakes, and corrections are welcome.
According to Section 3.1 of the paper Pedestrian Detection: An Evaluation of the State of the Art: 'We use the log-average miss rate to summarize detector performance, computed by averaging miss rate at nine FPPI rates evenly spaced in log-space in the range 10^(-2) to 10^0', the log-average miss rate mainly measures the recall ability of the model within that FPPI range (miss rate is 1 - recall): the lower the metric, the stronger the recall. AP, however, is calculated from the whole precision-recall curve, so it is also sensitive to precision. Ideally a model has both high precision and high recall, but in most cases the two trade off against each other as the confidence threshold changes. This may be why some models with lower AP report a lower MR (higher recall).
It seems that most pedestrian detection tasks tend to use MR as a metric: https://paperswithcode.com/task/pedestrian-detection
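To make the definition quoted above concrete, here is a minimal sketch of how a log-average miss rate could be computed from an already-built miss rate vs. FPPI curve. The function name `log_average_miss_rate` and its arguments are hypothetical, not taken from this repository. It assumes the curve is sorted by ascending FPPI, assigns a miss rate of 1.0 to reference points that lie below the first curve point, and averages in log space (a geometric mean), which is my reading of the metric's name; switching to an arithmetic mean is a one-line change.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Summarize a miss rate vs. FPPI curve as a single number.

    fppi:      false positives per image, sorted in ascending order
    miss_rate: miss rate (1 - recall) at each FPPI value
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Nine reference FPPI values evenly spaced in log space over [1e-2, 1e0].
    refs = np.logspace(-2.0, 0.0, num_points)

    sampled = np.empty(num_points)
    for i, ref in enumerate(refs):
        # Index of the last curve point whose FPPI does not exceed the reference.
        idx = np.searchsorted(fppi, ref, side="right") - 1
        # Assumption: if the curve starts above this reference, count it as a full miss.
        sampled[i] = miss_rate[idx] if idx >= 0 else 1.0

    # Assumption: geometric mean; the clamp avoids log(0) for a perfect detector.
    return float(np.exp(np.mean(np.log(np.maximum(sampled, 1e-10)))))
```

The result is a fraction, so a reported value such as 4.3 MR would correspond to roughly 0.043 here. Because only the 10^(-2) to 10^0 FPPI range enters the average, two detectors can rank differently under this metric than under AP, which integrates over the whole precision-recall curve.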
- Sorry, we have not tried experiments using fused images for object detection.
Thanks for your inspiring answer. Here is another discussion about AP and MR: https://patrick-llgc.github.io/Learning-Deep-Learning/paper_notes/ap_mr.html. Based on your explanation and that discussion, it may be preferable to report both metrics; luckily, you have provided both for subsequent research.
Thanks for your contribution!