Different mixup loss function between code and paper

facebookresearch / mixup-cifar10

mixup: Beyond Empirical Risk Minimization

Different mixup loss function between code and paper #18

Open lukk47 opened 5 years ago

lukk47 commented 5 years ago

The mixup loss function in the code is as below: [image not preserved], while according to the paper the mixture should be done before feeding into the loss function.

Will these two loss functions have the same results?

kleinzcy commented 4 years ago

@LokLu It is the same.

So the two loss functions are the same.

Let me know if I am wrong.
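A sketch of why the two agree, assuming the criterion is the standard cross-entropy applied to softmax probabilities. The symbols here ($p$ for the softmax of the model output, $y_a$ and $y_b$ for the one-hot labels, $\lambda$ for the mixing coefficient) are notation chosen for this sketch:

$$
\begin{aligned}
\mathrm{CE}\big(p,\ \lambda y_a + (1-\lambda) y_b\big)
&= -\sum_k \big(\lambda\, y_{a,k} + (1-\lambda)\, y_{b,k}\big) \log p_k \\
&= \lambda \Big(-\sum_k y_{a,k} \log p_k\Big) + (1-\lambda) \Big(-\sum_k y_{b,k} \log p_k\Big) \\
&= \lambda\, \mathrm{CE}(p, y_a) + (1-\lambda)\, \mathrm{CE}(p, y_b)
\end{aligned}
$$

The equality relies only on cross-entropy being linear in the target, so it would not carry over to a loss that is nonlinear in the label, such as squared error on the probabilities.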

lukk47 commented 4 years ago

@kleinzcy Your equations are correct. But the problem is that the 'pred' in your code is the logit output of the model rather than the softmax of pred.

lizc126 commented 3 years ago

> @kleinzcy Your equations are correct. But the problem is that the 'pred' in your code are the logits output of the model instead of the softmax of pred.

Hi, just saw this and I am curious as well. But the criterion should ideally take the logits output instead of the softmax of pred, right?
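For what it's worth, PyTorch's `nn.CrossEntropyLoss` applies log-softmax internally and expects raw logits, so passing logits as `pred` would be the intended usage if that is the criterion in question. Below is a minimal dependency-free sketch that checks the two formulations numerically. The helper names (`log_softmax`, `cross_entropy`, `one_hot`) and the sample values are made up for illustration and are not taken from the repository:

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of raw logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def cross_entropy(logits, target):
    # Cross-entropy between softmax(logits) and a (possibly soft) target distribution.
    log_p = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_p))

def one_hot(index, num_classes):
    return [1.0 if k == index else 0.0 for k in range(num_classes)]

# Hypothetical logits for a single mixed input, and a hypothetical mixing coefficient.
logits = [2.0, -1.0, 0.5]
lam = 0.3
y_a, y_b = one_hot(0, 3), one_hot(2, 3)

# Paper-style: mix the labels first, then compute a single loss.
mixed_target = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
loss_mixed_label = cross_entropy(logits, mixed_target)

# Code-style: compute two losses on the same logits, then mix the losses.
loss_mixed_loss = lam * cross_entropy(logits, y_a) + (1 - lam) * cross_entropy(logits, y_b)

# The two values should agree up to floating-point rounding.
print(math.isclose(loss_mixed_label, loss_mixed_loss, rel_tol=1e-12))
```

Both variants take the same logits and apply the softmax inside the loss, so the choice between logits and probabilities as input does not affect the equivalence as long as the softmax is applied exactly once.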