the implementation of losses

jeromerony / dml_cross_entropy

Code for the paper "A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses" (ECCV 2020 - Spotlight)

https://arxiv.org/abs/2003.08983

BSD 3-Clause "New" or "Revised" License


the implementation of losses #2

Closed: liquor233 closed this issue 4 years ago

liquor233 commented 4 years ago

Hi jeromerony, I like your paper very much; thanks for your wonderful work. I noticed that only cross-entropy is implemented in this repo: neither the PCE loss nor the SPCE loss can be found here. You also mentioned in another issue that this is an early version. To be clear, I am curious: have you actually implemented the PCE loss, or do you only use it to support your theory? Is the result in section 6.4 from CE loss alone, or is it a result of SPCE loss?

My English is poor; sorry to bother you.

jeromerony commented 4 years ago

Hi! Thanks for your interest in our paper. In issue #1, I mentioned that temperature scaling was indeed implemented earlier but was eventually partially removed for this version (it should have been removed completely). To be clear: this repo is not an earlier version of our work; the experiments were done using this code. I realize I did not make myself clear.

As for the PCE and SPCE experiments, they were implemented by @mboudiaf on a private clone: they were done on MNIST, so they required some code changes. Maybe you can add a few comments, @mboudiaf?

mboudiaf commented 4 years ago

Hi @liquor233, thank you for your interest in our paper. PCE is only used to make our point on the theoretical link between cross-entropy and pairwise losses. We did not implement PCE, as it requires computing the eigenvalues of a poorly conditioned matrix at every iteration, which isn't practical at all. Instead, SPCE can be seen as a sub-optimal but practical version of PCE, and was implemented on MNIST, once again only to empirically support the link between cross-entropy and pairwise losses. Results from section 6.4 are solely based on CE loss.
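For readers following along: the CE loss referred to above is the ordinary softmax cross-entropy over classifier logits. The snippet below is only a minimal pure-Python sketch of that loss and is not taken from the repository. The function names `cross_entropy` and `mean_cross_entropy` are hypothetical, and the `temperature` argument is an assumption that illustrates the usual meaning of temperature scaling (dividing the logits by a constant), not necessarily how it was done in the earlier code mentioned in issue #1.

```python
import math


def cross_entropy(logits, label, temperature=1.0):
    """Softmax cross-entropy for a single sample.

    logits: list of raw class scores (floats).
    label: index of the ground-truth class.
    temperature: hypothetical scaling factor (assumption); 1.0 means no scaling.
    """
    scaled = [z / temperature for z in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    # -log softmax(scaled)[label] = logsumexp(scaled) - scaled[label]
    return log_sum_exp - scaled[label]


def mean_cross_entropy(batch_logits, labels, temperature=1.0):
    """Average the per-sample cross-entropy over a batch."""
    losses = [
        cross_entropy(logits, label, temperature)
        for logits, label in zip(batch_logits, labels)
    ]
    return sum(losses) / len(losses)
```

With uniform logits over two classes, for instance, `cross_entropy([0.0, 0.0], 0)` equals `math.log(2)`, the loss of a classifier that cannot tell the classes apart.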

liquor233 commented 4 years ago

Now I understand. I was just surprised that using CE loss alone can achieve such good results. Thank you both very much!