Open lukk47 opened 5 years ago
@LokLu It is the same.
So the two loss functions are the same.
Let me know if I am wrong.
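For anyone following along, here is a short sketch of why the two forms agree, assuming the criterion is cross-entropy. The symbols are my own notation, not taken from the repo: $p = \mathrm{softmax}(\text{pred})$, $y_a$ and $y_b$ are one-hot labels, and $\lambda$ is the mixing coefficient. Cross-entropy is linear in the target, so mixing the labels first or mixing the two losses afterwards gives the same value:

$$
\begin{aligned}
\mathrm{CE}\big(p,\ \lambda y_a + (1-\lambda) y_b\big)
&= -\sum_k \big(\lambda y_{a,k} + (1-\lambda) y_{b,k}\big) \log p_k \\
&= \lambda \Big(-\sum_k y_{a,k} \log p_k\Big) + (1-\lambda) \Big(-\sum_k y_{b,k} \log p_k\Big) \\
&= \lambda\, \mathrm{CE}(p, y_a) + (1-\lambda)\, \mathrm{CE}(p, y_b)
\end{aligned}
$$

This only holds for losses that are linear in the target; it would not carry over to, say, a squared-error loss on the labels.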
@kleinzcy Your equations are correct. But the problem is that the `pred` in your code is the logits output of the model, not the softmax of `pred`.
Hi, just saw this and I am curious as well. But the criterion should ideally take the logits output instead of the softmax of pred, right?
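To make this concrete, here is a minimal numerical check in plain Python (no PyTorch). It is only a sketch: it assumes the criterion is a cross-entropy that takes raw logits and applies log-softmax internally, as `nn.CrossEntropyLoss` does. The helper names (`mixup_loss_code`, `mixup_loss_paper`, etc.) and the example values are made up for illustration and are not from the repo.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def cross_entropy(logits, target):
    # Cross-entropy between softmax(logits) and a target distribution.
    # Takes raw logits, like a criterion that applies log-softmax itself.
    log_p = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_p))

def one_hot(index, num_classes):
    return [1.0 if k == index else 0.0 for k in range(num_classes)]

def mixup_loss_code(logits, y_a, y_b, lam):
    # Assumed form of the code version: mix the two per-label losses.
    n = len(logits)
    return (lam * cross_entropy(logits, one_hot(y_a, n))
            + (1 - lam) * cross_entropy(logits, one_hot(y_b, n)))

def mixup_loss_paper(logits, y_a, y_b, lam):
    # Paper-style version: mix the one-hot labels first, then one loss call.
    n = len(logits)
    mixed = [lam * a + (1 - lam) * b
             for a, b in zip(one_hot(y_a, n), one_hot(y_b, n))]
    return cross_entropy(logits, mixed)

# Hypothetical example values.
logits = [2.0, -1.0, 0.5]
loss_code = mixup_loss_code(logits, y_a=0, y_b=2, lam=0.3)
loss_paper = mixup_loss_paper(logits, y_a=0, y_b=2, lam=0.3)
print(loss_code, loss_paper, math.isclose(loss_code, loss_paper))
```

So as long as the logits go through log-softmax inside the criterion, both versions should give the same number up to floating-point rounding. Passing already-softmaxed values into such a criterion would apply softmax twice, which is a separate issue from the mixing order.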
The mixup loss function in the code is as below, while according to the paper the mixing should be done before feeding into the loss function.
Will these two loss functions have the same results?