Closed GengDavid closed 3 years ago
Thanks for this good question. Actually, we follow the same initialization method as Focal Loss (see Sec. 4.1), which is also adopted in CornerNet and CenterNet. In particular, the bias is initialized as b = −log((1 − π)/π), where π means that 'every point should be labeled as foreground with confidence of ∼ π'. We set π = 0.1 in our paper, so b = −log((1 − 0.1)/0.1) ≈ −2.19.
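The computation above can be sketched in a couple of lines (the helper name is mine, not from the codebase):

```python
import math

def focal_loss_bias(pi: float) -> float:
    """Bias init from the Focal Loss paper (Sec. 4.1):
    b = -log((1 - pi) / pi), chosen so that sigmoid(b) = pi,
    i.e. every point starts as foreground with confidence ~pi."""
    return -math.log((1.0 - pi) / pi)

b = focal_loss_bias(0.1)
print(b)  # roughly -2.19
```

Note that sigmoid(−2.19) ≈ 0.1, so the head's initial foreground confidence matches π.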
Got it! Thanks. But it makes me wonder: why do you also use Focal loss for stuff position prediction? And why is the bias value not set there?
Actually, Focal loss is adopted to give things and stuff unified supervision. Since the position head for stuff does not contain the so-called 'positive point' (or rather, all stuff points can be viewed as positive), there is no need to specifically set a bias value for stuff positions.
Cool 👍, thanks for your reply.
Hi @yanwei-li, thanks for sharing your great work. I found that you set
POSITION_HEAD.THING.BIAS_VALUE = -2.19
. Is there any specific reason to set -2.19 as the bias value? It looks like a magic number, and when I set it to 0, the training loss becomes NaN after about 28k iterations (which is weird, and I'm not sure whether it comes from randomness or from the different bias value). Thanks.