Closed · qsisi closed this issue 2 months ago
Maybe it's because the first one is smoother than the second one?
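One way to make "smoother" concrete (a sketch of mine, not from the repo): the gradient of the softplus-style loss is bounded by 1 in magnitude, while the gradient of -log(p) blows up as p → 0:

```python
import torch
import torch.nn.functional as F

# Compare d(loss)/d(pred) for a positive label at small predictions
for p in [0.5, 1e-2, 1e-4]:
    x = torch.tensor(p, requires_grad=True)
    F.softplus(-x).backward()            # loss1-style term: log(1 + exp(-x))
    y = torch.tensor(p, requires_grad=True)
    (-torch.log(y)).backward()           # loss2-style term: -log(p)
    print(f"p={p:g}  softplus grad={x.grad.item():+.3f}  -log grad={y.grad.item():+.1f}")
```

The softplus gradient stays near -0.5 no matter how small p gets, while the -log gradient reaches -10000 at p = 1e-4.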
Also, if we let pred == gt, the balanced_ce_loss is still not equal to 0, which can be explained by the curves above: the computed loss never actually reaches zero.
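A minimal sketch confirming this, reusing the same formula as the linked losses.py (not the exact repo function):

```python
import torch
import torch.nn.functional as F

# pred == gt exactly
gt = torch.tensor([1.0, 0.0])
pred = gt.clone()

pos = (gt > 0.95).float()
label = pos * 2.0 - 1.0                                  # [1, -1]
a = -label * pred                                        # [-1, 0]
b = F.relu(a)                                            # [0, 0]
loss = b + torch.log(torch.exp(-b) + torch.exp(a - b))   # stable softplus(a)

print(loss)  # tensor([0.3133, 0.6931]) -- nonzero even though pred == gt
```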
Hi @qsisi, that's a great question, thank you! Yes, this loss is indeed always greater than zero. We've run some experiments and found out that the model converges better with this loss, but we're trying to fix this for the next version.
Hello! @nikitakaraevv
In the balanced CE loss implementation: https://github.com/facebookresearch/co-tracker/blob/9ed05317b794cd177674e681321780614a65e073/cotracker/models/core/cotracker/losses.py#L14-L38

For simplicity, assume a 1-D scenario where gt = [1, 0] and pred = [p0, p1] ∈ (0, 1). Then, following the linked code:

pos = [1, 0]
neg = [0, 1]
label = pos * 2 - 1 = [1, -1]
a = -label * pred = [-p0, p1]
b = relu(a) = [0, p1]
loss1 = b + log(exp(-b) + exp(a - b)) = [log(1 + exp(-p0)), p1 + log(1 + exp(-p1))]

My question is: why not use the standard cross-entropy computation instead, so that the loss becomes

loss2 = [-log(p0), -log(1 - p1)]?

What is the difference between these two loss computations, and is the first one better than the second?
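For concreteness, a small PyTorch sketch (not from the repo; p0 = 0.9, p1 = 0.1 are arbitrary) comparing the two computations on this 1-D example. It also shows that loss1 is exactly softplus(-label * pred), i.e. binary cross-entropy that treats pred as a logit rather than a probability:

```python
import torch
import torch.nn.functional as F

p0, p1 = 0.9, 0.1                        # arbitrary predictions in (0, 1)
pred = torch.tensor([p0, p1])
gt = torch.tensor([1.0, 0.0])

# loss1: the formula from losses.py, a numerically stable softplus(-label * pred)
label = (gt > 0.95).float() * 2.0 - 1.0  # [1, -1]
a = -label * pred
b = F.relu(a)
loss1 = b + torch.log(torch.exp(-b) + torch.exp(a - b))
assert torch.allclose(loss1, F.softplus(-label * pred))

# loss2: plain cross-entropy that treats pred as a probability
loss2 = -torch.log(torch.stack([pred[0], 1.0 - pred[1]]))

print(loss1)  # tensor([0.3412, 0.7444]) -- nonzero for any pred in (0, 1)
print(loss2)  # tensor([0.1054, 0.1054]) -- goes to 0 as p0 -> 1, p1 -> 0
```

Under this reading, loss1 would only reach 0 if the "logit" went to ±∞, which is why it stays positive whenever pred ∈ (0, 1).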