Hi, thank you very much for releasing this codebase -- it has been very useful to my project.
I'm wondering if there is a mistake in the Focal Loss implementation. The code in `loss/focal.py` first calculates the CrossEntropy loss, averages it over all samples, and only then applies the modulating factor: `loss = (1 - p) ** self.gamma * logp`. If I understand the original Focal Loss paper correctly, it computes the CrossEntropy loss, applies the modulating factor per sample, and only then averages the result over all samples.
I wonder, was the order changed on purpose in this repository? Applied after averaging, the modulating factor no longer down-weights easy examples individually, which I think loses the idea of Focal Loss.
If it's actually a mistake, a simple fix of changing the line
https://github.com/ZhaoJ9014/face.evoLVe/blob/63520924167efb9ef53dcceed0a15cf739cad1c9/loss/focal.py#L13
to `self.ce = nn.CrossEntropyLoss(reduction='none')` will suffice, since the per-sample losses are then kept until after the modulating factor is applied. Other implementations also seem to use this order, e.g. see 1 and 2.
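For illustration, here is a minimal sketch of what the fixed module could look like. This is an assumption about the surrounding code in `loss/focal.py` (the class name, `self.gamma`, and the shape of `forward` are inferred from the snippet quoted above), not a copy of the repository's file:

```python
import torch
import torch.nn as nn

class FocalLoss(nn.Module):
    """Hypothetical corrected Focal Loss sketch.

    With reduction='none', the CrossEntropy term stays per-sample, so the
    modulating factor (1 - p) ** gamma is applied to each sample *before*
    averaging, matching the original Focal Loss paper.
    """

    def __init__(self, gamma=2):
        super().__init__()
        self.gamma = gamma
        self.ce = nn.CrossEntropyLoss(reduction='none')  # the proposed fix

    def forward(self, input, target):
        logp = self.ce(input, target)        # per-sample -log p_t
        p = torch.exp(-logp)                 # per-sample p_t
        loss = (1 - p) ** self.gamma * logp  # modulate each sample
        return loss.mean()                   # average only at the end
```

As a sanity check, with `gamma=0` the modulating factor is 1 for every sample, so this reduces exactly to the mean CrossEntropy loss.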