Closed · AlanChou closed this issue 4 years ago
Thanks for pointing this out. I do have the reduction='none' flag in my codebase for the paper's experiments, so you are of course welcome to verify that the reported results are correct. I cleaned up the codebase for this GitHub release but did not re-run all the experiments, and apparently I introduced this mistake during the cleanup.
The reported results for focal loss seem reasonable to me. I believe you have it right in your codebase.
Thanks for contributing such great work. I look forward to the code for the iNaturalist experiment!
Hi,
I believe the focal loss implementation is incorrect (I hope I have not misunderstood the code). Although this does not affect the method you propose, I hope the authors will take some time to correct it.
Focal loss should compute -(1-p)^r * log(p) for every sample in the batch. However, F.cross_entropy at line 21 of losses.py uses the default reduction='mean', so its output is already a single scalar (the mean cross-entropy over the batch). You then use that scalar as p to compute the focal loss, which is not correct.
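For reference, here is a minimal sketch of what I mean (the function name and signature are just illustrative, not your API): keep the per-sample losses with reduction='none', recover p from them, and only average at the very end.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    # Per-sample cross entropy: reduction='none' gives one value per sample,
    # where each value is -log(p_t) for the true class.
    ce = F.cross_entropy(logits, target, reduction='none')
    # Recover p_t from the per-sample cross entropy.
    p_t = torch.exp(-ce)
    # Apply the focal modulating factor (1 - p_t)^gamma per sample,
    # then reduce to a single scalar only at the end.
    return ((1 - p_t) ** gamma * ce).mean()
```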
A clear sign of the problem is that you can remove the .mean() at line 11 of losses.py without causing any errors, which shows that the code is operating on a single scalar rather than a per-sample vector.
This might explain why your implementation is so different from https://github.com/Hsuxu/Loss_ToolBox-PyTorch/blob/master/FocalLoss/FocalLoss.py or https://github.com/clcarwin/focal_loss_pytorch/blob/master/focalloss.py
You can also check the prior work you cited, https://github.com/vandit15/Class-balanced-loss-pytorch/blob/master/class_balanced_loss.py, where the key point is that they pass reduction='none' to F.binary_cross_entropy_with_logits so the loss stays per-sample before the focal weighting is applied.
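The sigmoid/binary variant used there looks roughly like this (a paraphrased sketch, not a verbatim copy of that file): reduction='none' keeps the loss element-wise so the (1 - p_t)^gamma weighting can still be applied per sample.

```python
import torch
import torch.nn.functional as F

def sigmoid_focal_loss(logits, targets_one_hot, gamma=2.0):
    # Element-wise BCE: reduction='none' preserves per-sample, per-class
    # losses instead of collapsing them to a scalar.
    bce = F.binary_cross_entropy_with_logits(logits, targets_one_hot,
                                             reduction='none')
    p = torch.sigmoid(logits)
    # p_t is the predicted probability assigned to the ground-truth label.
    p_t = p * targets_one_hot + (1 - p) * (1 - targets_one_hot)
    # Focal weighting per element, then sum over classes and average over batch.
    return ((1 - p_t) ** gamma * bce).sum(dim=1).mean()
```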