xingyizhou / CenterNet2

Two-stage CenterNet
Apache License 2.0

What are the differences in the custom ROI head layers? #23

Closed lxtGH closed 3 years ago

lxtGH commented 3 years ago

📚 Documentation

Hi again! Thanks for open-sourcing the code. What are the main differences between "CustomCascadeROIHeads" / "CustomFastRCNNOutputLayers" and the original "CascadeROIHeads" / "FastRCNNOutputLayers"?

What are the results of using the original modules when RPN is replaced with CenterNet?

xingyizhou commented 3 years ago

Hi, Thank you for reading the code. Multiplying the proposal probability to the final detection score (during testing) is implemented in the custom heads. See the paper for the results without this. Training with the original implementation is fine for COCO. Also, the custom output layer contains the implementation of the Federated loss for LVIS.
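As a rough illustration of the score multiplication described above, here is a minimal sketch in plain Python. The function name `rescore` and the example numbers are hypothetical and are not taken from the CenterNet2 code; the actual custom heads may combine the two scores differently (for example, with extra normalization).

```python
def rescore(class_scores, proposal_score):
    """Scale each per-class score of one box by its first-stage proposal score.

    Hypothetical helper for illustration only; not the CenterNet2 implementation.
    """
    return [s * proposal_score for s in class_scores]


# Two boxes with identical second-stage class scores but different proposal scores.
confident = rescore([0.9, 0.05], proposal_score=0.8)
doubtful = rescore([0.9, 0.05], proposal_score=0.2)

# The box the first stage trusted more now ranks higher at test time.
print(confident[0] > doubtful[0])
```

The point of the sketch is only that a box's final ranking now depends on both stages, so an unreliable first-stage score can hurt as easily as a reliable one helps.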

lxtGH commented 3 years ago

Hi! Thanks for your quick answer. I will check the details in the paper. In fpn_p5.py, "build_p67_resnet_fpn_backbone" seems to be the same as the original RetinaNet backbone "build_retinanet_resnet_fpn_backbone". Is my understanding correct?

xingyizhou commented 3 years ago

No. The difference is that build_retinanet_resnet_fpn_backbone generates p6 from c5, while build_p67_resnet_fpn_backbone generates p6 from p5. This is discussed in the FCOS paper, and we followed it. In my experiments, they perform similarly, but p67 has slightly fewer FLOPs (256 channels in p5 vs. 2048 or 1024 channels in c5).
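The FLOPs difference can be estimated with a back-of-the-envelope count. The sketch below assumes p6 is produced by a single 3x3 stride-2 convolution with 256 output channels, and uses a made-up 13x13 output resolution; both are assumptions for illustration, not values stated in this thread. `conv_macs` is a hypothetical helper.

```python
def conv_macs(in_channels, out_channels, kernel_size, out_h, out_w):
    """Multiply-accumulate count of one 2D convolution (bias ignored)."""
    return in_channels * out_channels * kernel_size * kernel_size * out_h * out_w


# Hypothetical p6 output resolution; the ratio below does not depend on it.
OUT_H, OUT_W = 13, 13

from_c5 = conv_macs(2048, 256, 3, OUT_H, OUT_W)  # p6 computed from c5 (2048 channels)
from_p5 = conv_macs(256, 256, 3, OUT_H, OUT_W)   # p6 computed from p5 (256 channels)

print(from_c5 // from_p5)  # 8
```

Under these assumptions the p6 convolution itself is 8x cheaper when fed from p5, but it is one small layer in the whole network, which is consistent with the overall saving being slight.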

lxtGH commented 3 years ago

Hi! Thanks for your quick answer. So it is the FCOS setting. Regarding Table 5, "a stronger proposal network and incorporating the proposal score": I don't understand why only a strong single detector can improve the performance. It is a very useful conclusion for improving accuracy at inference. Can you give some extra explanation?

lxtGH commented 3 years ago

> Hi, Thank you for reading the code. Multiplying the proposal probability to the final detection score (during testing) is implemented in the custom heads. See the paper for the results without this. Training with the original implementation is fine for COCO. Also, the custom output layer contains the implementation of the Federated loss for LVIS.

Again, I verified this in my codebase. Just multiplying the score can lead to a 1 mAP gain. It is amazing!

xingyizhou commented 3 years ago

> why only strong single detector can improve the performance?

Are you asking why multiplying the proposal score does not improve the original RPN? The intuitive answer is that the RPN score is not accurate enough to help. More specifically, RPN sets a low negative threshold, which encourages many (false) positive boxes. This design serves the RPN's purpose of reaching high recall. However, in our analysis, we concluded that having only high recall in the first stage does not fit our target; we need both precision and recall in the first stage. This is what a complete one-stage detector provides. Table 4 in our paper gives a more detailed roadmap of how we make the original RPN closer to a one-stage detector (RetinaNet) by adding more layers and changing the thresholds and losses.
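One way to picture the effect of the negative threshold is the IoU-based label assignment used when training a proposal network or one-stage detector. The sketch below is a generic illustration: `assign_label` is a hypothetical helper, and the threshold pairs (0.3, 0.7) and (0.4, 0.5) are illustrative values, not numbers quoted in this thread.

```python
def assign_label(iou, neg_thresh, pos_thresh):
    """Return 1 (positive), 0 (negative), or -1 (ignored during training)."""
    if iou >= pos_thresh:
        return 1
    if iou < neg_thresh:
        return 0
    return -1


# A poorly localized box whose best overlap with any ground-truth box is 0.35.
iou = 0.35

low_neg = assign_label(iou, neg_thresh=0.3, pos_thresh=0.7)   # ignored, never penalized
high_neg = assign_label(iou, neg_thresh=0.4, pos_thresh=0.5)  # trained as a negative

print(low_neg, high_neg)  # -1 0
```

With the lower negative threshold, the classifier is never told that this box is bad, so it remains free to score it highly. That helps recall, but it makes the score less useful as a precision signal to multiply into the final detection score.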

Hope that helps. Feel free to follow up if you have more questions.

lxtGH commented 3 years ago

Hi! Thanks for your reply! Best wishes for this paper!