Open · y78h11b09 opened this issue 4 years ago
The clamp probably improves stability in some cases, but it is unnecessary: you can switch to the "with logits" version of the focal loss, as used in the TensorFlow version of the official code (quoted below from the official code's comments):
# Below are comments/derivations for computing modulator.
For brevity, let x = logits, z = targets, r = gamma, and p_t = sigmoid(x)
for positive samples and 1 - sigmoid(x) for negative examples.
#
The modulator, defined as (1 - p_t)^r, is a critical part of the focal loss
computation. For r > 0, it puts more weight on hard examples, and less
weight on easier ones. However, if it is directly computed as (1 - p_t)^r,
its back-propagation is not stable when r < 1. The implementation here
resolves the issue.
#
For positive samples (labels being 1),
(1 - p_t)^r
= (1 - sigmoid(x))^r
= (1 - (1 / (1 + exp(-x))))^r
= (exp(-x) / (1 + exp(-x)))^r
= exp(log((exp(-x) / (1 + exp(-x)))^r))
= exp(r log(exp(-x)) - r log(1 + exp(-x)))
= exp(- r x - r log(1 + exp(-x)))
#
For negative samples (labels being 0),
(1 - p_t)^r
= (sigmoid(x))^r
= (1 / (1 + exp(-x)))^r
= exp(log((1 / (1 + exp(-x)))^r))
= exp(-r * log(1 + exp(-x)))
#
Therefore one unified form for positive (z = 1) and negative (z = 0)
samples is:
(1 - p_t)^r = exp(-r * z * x - r * log(1 + exp(-x))).
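As a quick sanity check of the unified form above, here is a minimal plain-Python sketch (the function names are illustrative, not from the repo) comparing the direct modulator (1 - p_t)^r against the logits-space form:

```python
import math

def modulator_direct(x, z, r):
    # Direct computation: (1 - p_t)^r, with p_t = sigmoid(x) for z = 1
    # and 1 - sigmoid(x) for z = 0. Backprop through this is unstable
    # when r < 1 and p_t is close to 1.
    p = 1.0 / (1.0 + math.exp(-x))
    p_t = p if z == 1 else 1.0 - p
    return (1.0 - p_t) ** r

def modulator_stable(x, z, r):
    # Unified logits-space form from the derivation:
    # (1 - p_t)^r = exp(-r * z * x - r * log(1 + exp(-x)))
    return math.exp(-r * z * x - r * math.log1p(math.exp(-x)))
```

Both functions agree numerically for moderate logits; the logits-space form is the one that stays well-behaved under differentiation.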
Good job, thank you very much. I will try it. Have a nice day!
Recently, I found that EfficientDet-d0 didn't work for detecting 1600 objects, because cls_loss still didn't decrease during training.
So I changed `torch.clamp(classification, min=1e-4, max=1.0 - 1e-4)` to `torch.clamp(classification, min=1e-8, max=1.0 - 1e-8)` in focal_loss,
and then EfficientDet-d0 could be trained on 1600 objects. Can anyone tell me what the advantage of `torch.clamp()` in focal_loss is? I think it should be removed completely!
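For context on why the clamp exists at all: in probability space, a near-certain prediction saturates the sigmoid to exactly 0.0 or 1.0 in floating point, so log(p_t) blows up, and the clamp papers over that. The "with logits" form avoids the problem entirely. A minimal plain-Python sketch of both forms (function names are illustrative; this is not the repo's code):

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def focal_loss_prob(p, z, gamma=2.0, eps=0.0):
    # Probability-space focal loss: -(1 - p_t)^gamma * log(p_t).
    # Needs clamping (eps > 0) to avoid log(0) at saturated probabilities.
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if z == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def focal_loss_logits(x, z, gamma=2.0):
    # "With logits" form: stable BCE-with-logits times the stable
    # modulator exp(-gamma * (z*x + log(1 + exp(-x)))); no clamp needed.
    bce = softplus(x) - z * x  # equals -log(p_t)
    modulator = math.exp(-gamma * (z * x + softplus(-x)))
    return modulator * bce
```

At an extreme logit like x = 40 with z = 0, the probability-space form (without eps) hits log(0) and fails, while the logits form returns a finite loss, which is exactly why the official TensorFlow code computes the modulator in logit space.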