PeizeSun / OneNet

[ICML2021] What Makes for End-to-End Object Detection

Why does OneSeg performance drop? #26

Closed Dwrety closed 2 years ago

Dwrety commented 2 years ago

Very interesting work! I have read your implementation of CondInst with OneNet matching, and I've noticed a significant drop in mask AP compared to the original CondInst. What could be the causes?

Is it because a single positive sample per instance is not enough to train the mask branch? (I see you have doubled the mask loss weight.) Or is it because of the AdamW optimizer? Have you done any further digging into this issue?

There is a second question. The paper describes the matching cost as being computed over all anchor boxes/points. Have you tried first applying a hand-crafted assignment (e.g. FCOS's) and then matching among those candidates? In other words, do the assignment results from matching still satisfy those hand-crafted rules? For concreteness, a minimal sketch of the matching as I understand it is below. Love to hear from you. Thank you.
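
This is only how I read the paper: each GT is matched to the single sample with the lowest cost over all locations, where the cost combines a classification term and a location term (the weights here are placeholders, and I've omitted the GIoU cost term for brevity):

```python
import torch

def min_cost_assign(cls_prob, pred_boxes, gt_labels, gt_boxes,
                    w_cls=2.0, w_l1=5.0):
    # Illustrative sketch: weights are placeholders; the GIoU cost is omitted.
    # cls_prob:   (N, C) class probabilities at all N anchor points
    # pred_boxes: (N, 4) predicted boxes at those points
    # gt_labels:  (M,)   GT class indices
    # gt_boxes:   (M, 4) GT boxes
    cost_cls = -cls_prob[:, gt_labels]                # (N, M) classification cost
    cost_l1 = torch.cdist(pred_boxes, gt_boxes, p=1)  # (N, M) L1 box cost
    cost = w_cls * cost_cls + w_l1 * cost_l1
    # One positive sample per GT: the global minimum over all N locations.
    return cost.argmin(dim=0)                         # (M,) matched sample indices
```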

PeizeSun commented 2 years ago

Hi~

  1. OneSeg is just a naive combination of OneNet and CondInst; we haven't dug into the details.
  2. I guess the assignment results from matching are a little different from hand-crafted methods. For example, in human detection (Figure 9 in our paper), the matched positive samples fall inside the heads, while the hand-crafted positive samples are in the body parts.

Dwrety commented 2 years ago

Interesting. Thanks for your answer. I reimplemented your OneNet in the MMDetection framework and got a better result, 34+ mask mAP. After bringing a mask cost into the matching cost, I was able to reach around 35.6 mask mAP. I think the mask branch learns well enough; the performance cap is mostly due to classification error, which could come from underfitting with only one positive anchor per GT. The mask cost I used is sketched below.
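
It is essentially a (1 − Dice) term between each candidate's predicted mask and the GT masks; the function name, shapes, and weight below are from my own reimplementation, not the OneNet repo:

```python
import torch

def dice_cost(mask_logits, gt_masks, eps=1e-6):
    # Illustrative sketch from my MMDetection reimplementation.
    # mask_logits: (N, H*W) predicted mask logits, one per candidate sample
    # gt_masks:    (M, H*W) binary GT masks as float, flattened
    pred = mask_logits.sigmoid()
    inter = pred @ gt_masks.t()                                # (N, M)
    total = pred.sum(-1)[:, None] + gt_masks.sum(-1)[None, :]  # (N, M)
    dice = (2 * inter + eps) / (total + eps)
    return 1 - dice                                            # (N, M) mask cost

# Folded into the matching cost with its own weight before the per-GT argmin:
# cost = w_cls * cost_cls + w_l1 * cost_l1 + w_mask * dice_cost(mask_logits, gt_masks)
```

Since CondInst generates mask-head parameters per location, producing candidate masks for the cost computation can be expensive, so restricting the candidate set may be necessary.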