Shouldn't it be log_sigmoid instead of softmax ?

clcarwin / focal_loss_pytorch

A PyTorch Implementation of Focal Loss.

MIT License

953 stars, 226 forks

Shouldn't it be log_sigmoid instead of softmax ? #7

Open karanchahal opened 5 years ago

karanchahal commented 5 years ago

The paper mentions that the loss layer is combined with the sigmoid computation, not softmax. More specifically, this passage:

"Finally, we note that the implementation of the loss layer combines the sigmoid operation for computing p with the loss computation, resulting in greater numerical stability."

So aren't the authors saying that we should apply a sigmoid activation to the last layer? Using softmax instead might lead to lower accuracy.
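
To make the quoted point concrete, here is a minimal scalar sketch of what "combining the sigmoid with the loss computation" could look like. It is plain Python rather than this repository's code, and the function names are hypothetical. The form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t) and the defaults gamma = 2, alpha = 0.25 follow the paper; computing log(p_t) directly from the logit is one way to get the stability the quote refers to, not necessarily what the authors implemented.

```python
import math


def log_sigmoid(x: float) -> float:
    """Return log(sigmoid(x)) without ever forming sigmoid(x) itself."""
    # log(sigmoid(x)) = -softplus(-x), and
    # softplus(z) = max(z, 0) + log1p(exp(-|z|)) cannot overflow.
    return -(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))


def sigmoid_focal_loss(logit: float, target: int,
                       gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one raw logit (hypothetical helper)."""
    # p_t is sigmoid(logit) for a positive and 1 - sigmoid(logit) = sigmoid(-logit)
    # for a negative, so log(p_t) comes straight from the logit.
    log_pt = log_sigmoid(logit) if target == 1 else log_sigmoid(-logit)
    pt = math.exp(log_pt)
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - pt) ** gamma * log_pt


# A badly misclassified positive with an extreme logit still gives a finite loss.
print(sigmoid_focal_loss(-800.0, 1))
```

Computing the sigmoid first and then taking its log would break down on the same input: evaluating 1 / (1 + exp(800)) overflows in double precision. In a multi-class setting, the sigmoid variant would apply this per class to independent one-vs-all logits, whereas softmax couples the classes through a shared normalization.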

etienne87 commented 5 years ago

Actually, is there any benchmark comparing focal loss with softmax against focal loss with binary cross-entropy?