hyz-xmaster / VarifocalNet

VarifocalNet: An IoU-aware Dense Object Detector
Apache License 2.0

Question about Varifocal loss #8

Open HAOCHENYE opened 3 years ago

HAOCHENYE commented 3 years ago

In the paper, the weight applied to the BCE loss for negative samples is alpha * p^gamma. However, in varifocal_loss.py the weight is implemented as:

focal_weight = target * (target > 0.0).float() + alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float()

Here the negative weight is alpha * |p - q|^gamma. Why?

hyz-xmaster commented 3 years ago

This is the initial version of the VFL implementation, and I forgot to refine it. alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float() is actually equal to alpha * pred_sigmoid.pow(gamma) * (target == 0.0).float(), because the formula contains the multiplier (target <= 0.0).float() and the target is always >= 0.

HAOCHENYE commented 3 years ago

Do you mean that alpha * pred_sigmoid.abs().pow(gamma) * (target <= 0.0).float() equals alpha * pred_sigmoid.pow(gamma) * (target == 0.0).float(), or that alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float() equals alpha * pred_sigmoid.pow(gamma) * (target == 0.0).float()? I would understand it if you mean the former.

According to the paper, the negative weight should be alpha * pred_sigmoid.abs().pow(gamma) * (target <= 0.0).float(). Is the formula in the paper the current version?

hyz-xmaster commented 3 years ago

Hi, target is the IoU so it is always >= 0, which implies target <= 0 <=> target == 0.
Therefore, alpha * (pred_sigmoid - target).abs().pow(gamma) * (target <= 0.0).float() <=> alpha * (pred_sigmoid - target).abs().pow(gamma) * (target == 0.0).float() <=> alpha * pred_sigmoid.abs().pow(gamma) * (target == 0.0).float(), since the mask is nonzero only where target == 0, and there pred_sigmoid - target is just pred_sigmoid.
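
The equivalence argued above can be checked numerically. The following is a minimal plain-Python sketch, not the repository's PyTorch code: the function names are hypothetical, and the values alpha = 0.75 and gamma = 2.0 are assumed purely for illustration. It compares the weight as written in the quoted snippet with the weight as stated in the paper over a grid of predictions p and non-negative targets q.

```python
# Plain-Python sketch (not the repository's PyTorch code).
# Function names are hypothetical; alpha and gamma are illustrative values.

def negative_weight_code(p, q, alpha=0.75, gamma=2.0):
    """Negative-sample weight as written in the snippet quoted in this thread."""
    return alpha * abs(p - q) ** gamma * float(q <= 0.0)


def negative_weight_paper(p, q, alpha=0.75, gamma=2.0):
    """Negative-sample weight as stated in the paper: alpha * p^gamma when q == 0."""
    return alpha * p ** gamma * float(q == 0.0)


# p is a sigmoid output in [0, 1]; q is an IoU target, so q >= 0.
grid = [i / 10 for i in range(11)]
mismatches = [
    (p, q)
    for p in grid
    for q in grid
    if negative_weight_code(p, q) != negative_weight_paper(p, q)
]
print(mismatches)  # an empty list means the two forms agree on this grid
```

The two forms would differ only for a negative target, which cannot occur when the target is an IoU.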

HAOCHENYE commented 3 years ago

Ohhh! Thanks, I understand it now.

feiyuhuahuo commented 3 years ago

[Screenshot from 2021-01-05 10-31-48]

Hi @hyz-xmaster,

  1. I could not find the q in the red circle anywhere in the code.
  2. I don't understand the term above the green line. Since log(1-p) is used for predicting negative samples, why does it appear in the q > 0 case? In any case, I could not find the corresponding implementation in the code. I can only understand the code in the following way:

[Screenshot from 2021-01-05 10-41-02]

Looking forward to your reply, thanks.

hyz-xmaster commented 3 years ago

Hi @feiyuhuahuo,

  1. target in the code represents q in that formula.
  2. q*log(p) + (1-q)*log(1-p) is the binary cross-entropy term (up to its leading minus sign), which is calculated by F.binary_cross_entropy_with_logits. When q = 0, q*log(p) + (1-q)*log(1-p) reduces to log(1-p). When q > 0, it remains unchanged.
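
To make the reduction in point 2 concrete, here is a minimal scalar sketch in plain Python. It is an illustration under stated assumptions rather than the repository's implementation: it takes a probability p strictly between 0 and 1 instead of logits, the function names are hypothetical, and alpha = 0.75 and gamma = 2.0 are assumed values. The weight mirrors the focal_weight expression quoted at the top of this thread.

```python
import math


def bce(p, q):
    """Binary cross-entropy for a probability p in (0, 1) and a soft target q."""
    return -(q * math.log(p) + (1.0 - q) * math.log(1.0 - p))


def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """Scalar sketch: weight BCE by q for positives, by alpha * p^gamma for negatives."""
    # Hypothetical helper; alpha and gamma are illustrative values.
    weight = q if q > 0.0 else alpha * p ** gamma
    return weight * bce(p, q)


p = 0.3
# With q = 0 the BCE collapses to -log(1 - p), so only that term is weighted.
print(math.isclose(bce(p, 0.0), -math.log(1.0 - p)))
# With q > 0 both terms of the BCE are kept and scaled by q.
print(math.isclose(varifocal_loss(p, 0.8), 0.8 * bce(p, 0.8)))
```

A single BCE call therefore covers both branches of the formula; only the multiplying weight changes between q = 0 and q > 0.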

yxx-byte commented 2 years ago

Hello, did you add this loss to YOLOv5? If so, which parts of the code need to be adjusted?