Closed liquor233 closed 4 years ago
Hi! Thanks for your interest in our paper. In issue #1, I mentioned that temperature scaling was implemented earlier but was only partially removed for this version (it should have been removed completely). To be clear: this repo is not an earlier version of our work; the experiments were done using this code. I realize I did not make myself clear.
About the PCE and SPCE experiments: they were implemented by @mboudiaf on a private clone, since they were done on MNIST and required some code changes. Maybe you can add a few comments, @mboudiaf?
Hi @liquor233, Thank you for your interest in our paper. PCE is only used to make our point on the theoretical link between cross-entropy and pairwise losses. We did not implement PCE as it requires computation of eigenvalues of a poorly conditioned matrix at every iteration, which isn't practical at all. Instead, SPCE can be seen as a sub-optimal but practical version of PCE, and was implemented on MNIST, once again only to empirically support the link between cross-entropy and pairwise losses. Results from section 6.4 are solely based on CE loss.
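For readers following along, here is a minimal sketch of plain softmax cross-entropy, the only loss the section 6.4 results rely on. It is not code from this repo: the function names and the optional `temperature` argument are assumptions made for illustration (with `temperature=1.0` it reduces to standard CE).

```python
import math


def cross_entropy(logits, label, temperature=1.0):
    """Softmax cross-entropy for one sample (illustrative helper, not from the repo).

    `temperature` is a hypothetical parameter; 1.0 gives standard CE.
    """
    scaled = [z / temperature for z in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    # CE = -log softmax(scaled)[label] = logsumexp(scaled) - scaled[label]
    return log_sum_exp - scaled[label]


def mean_cross_entropy(batch_logits, labels, temperature=1.0):
    """Average the per-sample loss over a batch."""
    losses = [cross_entropy(l, y, temperature) for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)
```

With two equal logits the loss is log 2 regardless of the label, which is a quick sanity check for any CE implementation.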
Now I understand. I was just surprised that using CE loss alone can achieve such good results. Thank you both very much!
Hi jeromerony, I like your paper very much; thanks for your wonderful work. I noticed that only cross-entropy is implemented in this repo: neither the PCE loss nor the SPCE loss can be found here, and you also mentioned in another issue that this is an early version. To clarify, I am curious: did you actually implement the PCE loss, or did you only use it to demonstrate your theory? Are the results in section 6.4 obtained purely with CE loss, or with SPCE loss?
My English is poor, sorry to bother you.