Closed · Cogito2012 closed this issue 1 year ago
Hi, thanks for sharing the great repo! Could you explain the differences between focal scaling and the original focal loss? From the implementation here: ./detectron2/modeling/roi_heads/fast_rcnn.py#L630, it seems to be a special case of focal loss with `alpha = 1` and `gamma = 0.5`. Besides, why is focal scaling effective in mitigating the forgetting of concepts learned in the pretraining stage, when fine-tuning uses only a small number of categories?

Looking forward to your reply! Thank you!

---

Hi @Cogito2012, the focal scaling in our paper extends the key idea of the original focal loss, namely suppressing confident predictions. In our transfer learning setting (using only base categories), the model tends to bias toward those base categories, making their confidence scores high. Focal scaling makes the model less biased toward these training samples and thus reduces the forgetting.
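To make the relationship concrete, here is a minimal sketch (not the repo's actual code; the function names are my own) showing that focal scaling with `gamma = 0.5` is the original focal loss with `alpha = 1`, and how it down-weights confident predictions:

```python
import math

def cross_entropy(p):
    """Standard cross-entropy for true-class probability p."""
    return -math.log(p)

def focal_loss(p, alpha=0.25, gamma=2.0):
    """Original focal loss (Lin et al.): alpha * (1 - p)^gamma * CE."""
    return alpha * (1.0 - p) ** gamma * cross_entropy(p)

def focal_scaled_ce(p, gamma=0.5):
    """Focal scaling: the alpha = 1, gamma = 0.5 special case."""
    return (1.0 - p) ** gamma * cross_entropy(p)

# A confident prediction (p = 0.99) gets its loss scaled by
# (1 - 0.99)**0.5 = 0.1, i.e. a 10x down-weighting, so easy
# base-category samples contribute much smaller gradients.
print(focal_scaled_ce(0.99) / cross_entropy(0.99))  # 0.1
```

Because fine-tuning sees only base categories, those samples quickly become "easy" (high confidence); shrinking their loss this way limits how far the model drifts from its pretrained weights, which is the forgetting-mitigation effect described above.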