Albert0147 / AaD_SFDA

Code for our NeurIPS 2022 (spotlight) paper 'Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation'

It seems the code of the loss part does not match the paper #2

Closed by theim4you 1 year ago

theim4you commented 1 year ago

Hi Shiqi,

Thank you very much for providing the code, and congratulations on your paper's acceptance!

After reading the loss function part, I could not see how the code reflects Eq. 5 of the paper. Please correct me if I missed anything:

First, the first term of Eq. 5 encourages consistency between local neighbours. I understand that you use a memory bank to save the softmax outputs and then retrieve the K nearest neighbours.

https://github.com/Albert0147/AaD_SFDA/blob/9a4c8bf9bfb6ab0800be55163c82d8ee71e7e6be/tar_adaptation.py#L316
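
In other words, I read the retrieval step as something like the following sketch (hypothetical names `fea_bank`, `score_bank`, `get_neighbors`; my paraphrase, not your exact code):

```python
import torch
import torch.nn.functional as F

# fea_bank  : (N, D) L2-normalized features of the whole target set
# score_bank: (N, C) stored softmax predictions for the same N samples
def get_neighbors(feats, fea_bank, score_bank, K):
    feats = F.normalize(feats, dim=-1)          # (B, D) batch features
    sim = feats @ fea_bank.t()                  # (B, N) cosine similarities
    _, idx = torch.topk(sim, k=K + 1, dim=-1)   # top hit is the sample itself
    idx = idx[:, 1:]                            # drop self, keep K neighbours
    return score_bank[idx]                      # (B, K, C) neighbour predictions
```

This part looks fine to me.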

However, I cannot see the connection between the KL-divergence loss below and the first term of Eq. 5:

https://github.com/Albert0147/AaD_SFDA/blob/9a4c8bf9bfb6ab0800be55163c82d8ee71e7e6be/tar_adaptation.py#L320

This line of code does not seem equivalent to the first term of Eq. 5. I am confused by this; please help me resolve it.

Second, the second term is meant to disperse the predictions of potentially dissimilar features. However, I do not see this reflected in your code:

https://github.com/Albert0147/AaD_SFDA/blob/9a4c8bf9bfb6ab0800be55163c82d8ee71e7e6be/tar_adaptation.py#L330

Given a test sample, this line of code treats the rest of the samples in the mini-batch as the background set. This also does not match the definition of the background set in the paper.
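
Concretely, my reading of that line is a batch-level dispersion term along these lines (my own sketch with hypothetical names, not your exact code):

```python
import torch

def dispersion_term(softmax_out):
    """Push apart the predictions of all other samples in the mini-batch."""
    B = softmax_out.shape[0]
    dot = softmax_out @ softmax_out.t()                   # (B, B) prediction dot products
    mask = 1.0 - torch.eye(B, device=softmax_out.device)  # zero out the self-pairs
    return (dot * mask).sum(-1).mean()                    # minimizing this disperses predictions
```

Here the "background" is just the rest of the mini-batch, not the background set as defined in Eq. 5.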

The paper claims to "provide a surprisingly simple solution for source-free domain adaptation, which is an upperbound of the proposed clustering objective". I therefore expect the code to correspond to the equation.

Please help me address these concerns, and correct me if I have misunderstood anything.

Albert0147 commented 1 year ago

Hi there.

  1. Please check the PyTorch documentation of F.kl_div (be careful about whether the input is in log-space or not); optimizing this loss is equivalent to operating on the dot product. See the sketch after this list.

  2. Please read our paper carefully: we already mention, in the upper-bound derivation section, that we approximate the whole dataset by the mini-batch.
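
For point 1, a quick numerical check of the equivalence (my own sketch, not the repository code): `F.kl_div(input, target)` computes `target * (log(target) - input)`, so passing the probabilities `p` directly as `input` (no log) gives `q*log(q) - q*p`, where the `q*log(q)` part is constant with respect to `p`. The gradient therefore matches that of the negative dot product:

```python
import torch
import torch.nn.functional as F

p = torch.softmax(torch.randn(4, 10), dim=-1).requires_grad_()  # predictions
q = torch.softmax(torch.randn(4, 10), dim=-1)                   # neighbour scores (no grad)

# F.kl_div with a *non-log* input computes q * (log(q) - p) = q*log(q) - q*p
loss_kl = F.kl_div(p, q, reduction="none").sum()
(g_kl,) = torch.autograd.grad(loss_kl, p, retain_graph=True)

# Negative dot product between predictions and neighbour scores
loss_dot = -(p * q).sum()
(g_dot,) = torch.autograd.grad(loss_dot, p)

print(torch.allclose(g_kl, g_dot))  # True: identical gradients w.r.t. p
```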

Albert0147 commented 1 year ago

I think my reply is fairly clear, so I am closing this issue.