Closed jcdubron closed 3 years ago
I notice that loss_bbox is weighted to be equal to the sum of loss_rank and loss_sort. If not, how is the performance? What's the intuition of this action? And is there any other reference to do so? https://github.com/kemaloksuz/RankSortLoss/blob/b41f64df70b1f8e3260e5c04015e0c7b59b339d9/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py#L287-L289
"If not, how is the performance?": Table 9 in our paper shows on ATSS that the performance is similar (39.8 w/o self-balancing vs 39.9 with self-balancing) and using this strategy reduces the number of hyperparameters to be tuned. Note that to obtain 39.8, we tuned task-balancing scalar to 2 (c.f. Table A.11 below), but with self-balancing, there is no need for tuning.
"What's the intuition of this action?": This is a simple heuristic to discard tuning task-balancing coefficients. We analysed equalizing values and gradients (see Table A.11 below) and observed that when we use losses with similar ranges (RS Loss for classification, GIoU Loss for box regression and Dice Loss for mask prediction - see also Fig. 3 in our paper), value-based approach performs as well as tuning.
" And is there any other reference to do so?" Previously in our NeurIPS 20 paper (aLRP Loss - https://arxiv.org/abs/2009.13592), we also used a self-balancing strategy. It was a bit different: For example, in aLRP Loss self-balancing was epoch-based, but in RS Loss it is iteration-based. Overall, the strategy is simpler in RS Loss. I don't remember any other detection/segmentation paper to use this kind of balancing strategy or design/analyse losses with bounded & similar ranges in all sub-tasks (i.e. classification, box regression, mask prediction).
Thanks for your detailed explanation. The comprehensive experiments validate the effectiveness of this design.