Closed YanWang2014 closed 6 years ago
Hi Yan,
For both confusion losses, the preferred input is softmax probabilities (obtained by feeding the logits through a softmax). You can alternatively try logits for pairwise confusion, but the loss weight would then have to be scaled to a very small value to prevent oscillations, so we recommend operating on the softmax probabilities themselves.
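A minimal sketch of what this advice amounts to (the tensor shapes and variable names here are hypothetical, not from the repo):

```python
import torch
import torch.nn.functional as F

# Hypothetical batch of logits, e.g. 8 images over 200 fine-grained classes.
logits = torch.randn(8, 200)

# Feed the logits through a softmax; these probabilities are the
# preferred input for both confusion losses.
probs = F.softmax(logits, dim=1)

# Each row is a valid distribution: strictly positive, sums to 1.
assert (probs > 0).all()
assert torch.allclose(probs.sum(dim=1), torch.ones(8), atol=1e-5)
```

The same `probs` tensor would then be passed to the confusion loss alongside the usual cross-entropy term.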
Abhi
Thank you!
@abhimanyudubey
In EntropicConfusion (https://github.com/abhimanyudubey/confusion/blob/master/confusion_pytorch/__init__.py#L15), it seems a negative sign is missing compared to the paper. Isn't this a loss function to be minimized?
In the paper, the corresponding equation carries a negative sign: [equation screenshot]
So for PairwiseConfusion, are we using logits, i.e., the raw outputs of PyTorch models?
But for EntropicConfusion, we should clearly use softmax probabilities, which are obtained by feeding the logits through a softmax function.
Am I right? Thank you!
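On the sign question above, one detail may resolve it: the entropy of a distribution is H(p) = -sum(p * log p), so minimizing sum(p * log p) is the same as maximizing entropy; the sign in the code versus the paper can differ simply because one writes the quantity being minimized and the other the entropy being maximized. A small sketch (my own function name, not the repo's exact implementation):

```python
import torch
import torch.nn.functional as F

def entropic_confusion(probs):
    # Entropy-maximizing regularizer: sum(p * log p) is the negative
    # entropy, so driving this term down pushes each row of `probs`
    # toward the uniform distribution. Averaged over the batch.
    batch_size = probs.size(0)
    return (probs * probs.log()).sum() / batch_size

logits = torch.randn(8, 10)
probs = F.softmax(logits, dim=1)   # softmax probabilities, as recommended
loss = entropic_confusion(probs)   # scalar; always in [-log(10), 0) here
```

Because each per-sample term sum(p * log p) lies in [-log C, 0), the loss is negative, and minimizing it spreads the probability mass out.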