tim-learn / SHOT

Code released for our ICML 2020 paper "Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation"
MIT License

Question about Information Maximization loss #19

Closed: saltoricristiano closed this issue 3 years ago

saltoricristiano commented 3 years ago

Hi all,

thank you very much for sharing the code of your very interesting work! I'm having trouble relating the code to equation (3) of the ICML paper and equation (3) of the TPAMI paper. My doubt is the following: in the paper, L_im is defined as the weighted sum of the entropy loss and the divergence loss:

[Screenshot of Eq. (3): L_im = L_ent + beta * L_div]

Instead, what you do in your code here https://github.com/tim-learn/SHOT/blob/7cebb390194215823b435b0723c7b342ae62b42b/object/image_target.py#L205 is to subtract the CE loss from the entropy loss. I'll try to be clearer. Following your paper I was expecting total_loss = entropy_loss + beta * gentropy_loss, while following the code at that line the resulting loss is total_loss = entropy_loss - beta * gentropy_loss, which minimizes the entropy while maximizing the divergence component. Is there anything I'm misunderstanding?
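To make the two readings concrete, here is a minimal self-contained sketch (the variable names mirror the ones above; the snippet is illustrative, not the repo's exact code):

```python
import torch

eps = 1e-5
# dummy predictions: batch of 32 samples, K = 10 classes
probs = torch.softmax(torch.randn(32, 10), dim=1)

# entropy_loss: per-sample entropy, averaged over the batch (L_ent)
entropy_loss = torch.mean(-torch.sum(probs * torch.log(probs + eps), dim=1))

# gentropy_loss: entropy of the mean prediction, -sum_k phat_k log phat_k
msoftmax = probs.mean(dim=0)
gentropy_loss = torch.sum(-msoftmax * torch.log(msoftmax + eps))

beta = 1.0
expected = entropy_loss + beta * gentropy_loss  # my reading of Eq. (3)
actual = entropy_loss - beta * gentropy_loss    # what image_target.py#L205 computes
```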

Thank you in advance for any help!

tim-learn commented 3 years ago

Hi, note that gentropy_loss in the code carries a minus sign compared with L_div in the paper: it is the entropy of the mean prediction, -sum_k phat_k log phat_k, while L_div is +sum_k phat_k log phat_k. Subtracting gentropy_loss is therefore the same as adding L_div. The code is consistent with Eq. (3).
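For later readers, a minimal sketch that spells the sign out numerically (names are illustrative, not the repo's exact code):

```python
import torch

eps = 1e-5
# mean prediction phat over a dummy batch
msoftmax = torch.softmax(torch.randn(32, 10), dim=1).mean(dim=0)

gentropy_loss = torch.sum(-msoftmax * torch.log(msoftmax + eps))  # entropy of phat
L_div = torch.sum(msoftmax * torch.log(msoftmax + eps))           # paper's L_div

# gentropy_loss == -L_div, so entropy_loss - gentropy_loss == entropy_loss + L_div
assert torch.allclose(gentropy_loss, -L_div)
```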

saltoricristiano commented 3 years ago

Hi @tim-learn,

thanks for your answer! Indeed, I hadn't noticed that you compute the CE loss with the minus sign already included and then subtract it. Closing the issue!