Open yxsysu opened 3 months ago
This question concerns how the contrastive loss is calculated. The standard MoCo implementation computes it with cross entropy, and the core step is constructing a label (target). Our method follows MoCo in this respect. The label uses two one-hot vectors whose sum is 1. As a result, b_i equals 1 and a_i is less than 1, which achieves a similar regularization effect.
Thank you very much for sharing your work. However, I have some questions regarding the loss function part in the DIFO paper and code. Specifically, the calculation method of the L_MCE loss function proposed in the paper seems to differ from the code implementation. Could you please provide a more detailed explanation?
Please refer to our previous answer: This question concerns how the contrastive loss is calculated. The standard MoCo implementation computes it with cross entropy, and the core step is constructing a label (target). Our method follows MoCo in this respect. The label uses two one-hot vectors whose sum is 1. As a result, b_i equals 1 and a_i is less than 1, which achieves a similar regularization effect.
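For readers following the thread, here is a minimal sketch of the MoCo-style calculation described in the answer: the contrastive loss is a cross entropy between the softmax of the similarity logits and a constructed target. It is not the DIFO code. The similarity values, the second label index, and the mixing weight 0.7 are hypothetical, and combining the two one-hot labels as a convex mixture is an assumption about what "sum is 1" means.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target):
    # Cross entropy between a target distribution and softmax(logits).
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def mix_targets(first, second, weight):
    # Convex combination of two one-hot labels; the entries still sum to 1.
    # (Assumed reading of "two one-hot, sum of them satisfies 1".)
    return [weight * a + (1.0 - weight) * b for a, b in zip(first, second)]

# MoCo-style logits: the positive key sits at index 0, negatives follow,
# and every similarity is divided by a temperature (0.07 is MoCo's default).
temperature = 0.07
similarities = [0.9, 0.2, -0.1, 0.4]  # hypothetical values
logits = [s / temperature for s in similarities]

# Standard MoCo target: a single one-hot label on the positive key.
hard_target = one_hot(0, len(logits))
hard_loss = cross_entropy(logits, hard_target)

# Target built from two one-hot labels (hypothetical second index and weight).
soft_target = mix_targets(one_hot(0, 4), one_hot(3, 4), weight=0.7)
soft_loss = cross_entropy(logits, soft_target)

print(f"hard-label loss: {hard_loss:.4f}")
print(f"mixed-label loss: {soft_loss:.4f}")
```

With a single one-hot target the loss reduces to the negative log-probability of the positive key, which is the usual InfoNCE form. Spreading the target over two entries penalizes over-confidence on either one, which is one way to read the "similar regularization effect" mentioned above.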
Thank you for your patient reply. I understand now. I am looking into MoCo's calculation method.
Could you please tell me which lines of code implement this MCE loss? I can only find the IID loss, which performs joint mutual information maximization. https://github.com/tntek/source-free-domain-adaptation/blob/deb4b1faebd72e911e265a705ef57ac1de301fd6/src/methods/oh/difo.py#L243
Lines 235-239.
Hi, thanks for your great work. I have a question after reading the paper and the code. It seems that the loss L_MCE is computed at lines 180-184 in src/methods/net/difo.py, but it does not exactly match the form in the paper. Could you give me some hints? Thanks for your reply.