theim4you closed this issue 1 year ago.
Hi there.
Please check the PyTorch documentation of F.kl_div (be careful about whether the input is in log space or not); optimizing this loss is equivalent to optimizing the dot product.
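A minimal sketch of that point, assuming the raw softmax output (not its log) is passed to F.kl_div: the only trainable part of the resulting expression is a negative dot product, so the gradients of the two losses coincide. The tensors and shapes below are illustrative, not taken from the repo.

```python
import torch
import torch.nn.functional as F

# F.kl_div(input, target) computes target * (log(target) - input),
# i.e. it assumes `input` is ALREADY in log space.  If a raw softmax
# output p is passed as `input` instead of log p, the loss becomes
#     sum_c q_c * (log q_c - p_c),
# whose only trainable part is -sum_c q_c * p_c: minimizing it is the
# same as maximizing the dot product q . p (q treated as a constant).
p = torch.softmax(torch.randn(4, 10), dim=-1).requires_grad_(True)
q = torch.softmax(torch.randn(4, 10), dim=-1)  # fixed neighbour prediction

kl = F.kl_div(p, q, reduction="none").sum(-1).mean()
dot = -(q * p).sum(-1).mean()

# The two losses differ only by a term constant in p, so their
# gradients w.r.t. p are identical.
g_kl = torch.autograd.grad(kl, p, retain_graph=True)[0]
g_dot = torch.autograd.grad(dot, p)[0]
```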
Please read our paper carefully: we already mention that we approximate the whole dataset by the mini-batch, in the upper-bound derivation section.
I think my reply is fairly clear, so I am closing this issue.
Hi Shiqi,
Thank you very much for providing the code, and congratulations on your paper's acceptance!
After reading the loss-function part, I could not find how the code reflects Eq. 5 of the paper. Please correct me if I missed anything:
First, the first term of Eq. 5 is the consistency between local neighbours. I understand that you use a memory bank to save the softmax outputs and then retrieve the K nearest neighbours.
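As a rough sketch of the lookup being described (all names and shapes here are hypothetical, not taken from the repo): keep L2-normalized features and softmax scores for the whole target set in two banks, then fetch each batch sample's K nearest neighbours by cosine similarity.

```python
import torch
import torch.nn.functional as F

N, D, C, K = 100, 16, 10, 3  # bank size, feature dim, classes, neighbours
fea_bank = F.normalize(torch.randn(N, D), dim=-1)   # stored features
score_bank = torch.softmax(torch.randn(N, C), dim=-1)  # stored predictions

batch_idx = torch.arange(8)          # indices of the current mini-batch
feats = fea_bank[batch_idx]          # (B, D), already normalized

sim = feats @ fea_bank.t()           # cosine similarity to the whole bank
_, knn_idx = sim.topk(K + 1, dim=-1) # +1 because each sample matches itself
knn_idx = knn_idx[:, 1:]             # drop the self-match
score_near = score_bank[knn_idx]     # (B, K, C) neighbour predictions
```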
https://github.com/Albert0147/AaD_SFDA/blob/9a4c8bf9bfb6ab0800be55163c82d8ee71e7e6be/tar_adaptation.py#L316
However, I cannot find the connection between the KLD loss and the first term of Eq. 5:
https://github.com/Albert0147/AaD_SFDA/blob/9a4c8bf9bfb6ab0800be55163c82d8ee71e7e6be/tar_adaptation.py#L320
This line of code does not seem to equal the first term of Eq. 5. I am confused by this; please help me resolve it.
Second, the second term is meant to disperse the predictions of potentially dissimilar features. However, I do not see how the code reflects this:
https://github.com/Albert0147/AaD_SFDA/blob/9a4c8bf9bfb6ab0800be55163c82d8ee71e7e6be/tar_adaptation.py#L330
Given a test sample, this line of code regards the remaining samples in the batch as its background set. This also does not match the definition of the background set in the paper.
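For concreteness, a minimal sketch of the batch-as-background approximation being questioned here (hypothetical names, not the repo's code): every other sample in the mini-batch is treated as potential background, and the dot products between their predictions are pushed down.

```python
import torch

B, C = 8, 10  # batch size, number of classes
softmax_out = torch.softmax(torch.randn(B, C), dim=-1)

mask = 1.0 - torch.eye(B)              # exclude each sample itself
dot = softmax_out @ softmax_out.t()    # (B, B) pairwise prediction dot products
neg_loss = (dot * mask).sum(-1).mean() # minimize similarity to the rest of the batch
```

Whether this mini-batch stand-in faithfully approximates the background set defined over the whole dataset is exactly the point of the question above.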
This work claims to "provide a surprisingly simple solution for source-free domain adaptation, which is an upperbound of the proposed clustering objective". Therefore, I expect the code to correspond to the equation.
Please help me address these concerns, and correct me if I have misunderstood anything.