Closed by milliema 11 months ago
Thank you for your interest in our work.
Exactly. loss_con is used to train the backbone model, and loss_con_T is used for the temperature parameter (T_c). A detailed explanation is provided in Section 4.9 of our paper (https://openreview.net/pdf?id=KqNX6VOqnJ).
We use a non-self mask to train T_c: the self-augmented sample is excluded when estimating T_c, i.e., the variance of the Gaussian kernel. Including the self-augmented sample would yield a much lower T_c, because it is highly correlated with the input.
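For readers following along, here is a rough sketch of what a non-self mask can look like. It is not the repository's code. It assumes a batch of `n_images` images with `n_views` augmented views each, stacked view by view, so that rows `i` and `i + n_images` come from the same image; the names `non_self_mask` and `masked_mean` are hypothetical.

```python
def non_self_mask(n_images, n_views=2):
    """Build a 0/1 mask over all view pairs.

    Assumed layout (hypothetical): views are stacked so that index i and
    index i + n_images are two augmentations of the same image. An entry is 0
    when row and column come from the same image (the sample itself or its
    self-augmented view) and 1 otherwise.
    """
    total = n_images * n_views
    return [
        [0 if i % n_images == j % n_images else 1 for j in range(total)]
        for i in range(total)
    ]


def masked_mean(values, mask):
    """Average the entries of `values` where `mask` is 1."""
    kept = [v for row_v, row_m in zip(values, mask)
            for v, m in zip(row_v, row_m) if m == 1]
    return sum(kept) / len(kept)


mask = non_self_mask(n_images=3)
```

Any statistic computed through `masked_mean` then ignores the highly correlated self pairs, which is the effect described above.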
Since we use a cosine similarity classifier (normed linear classifier), the scale of the output logit is significantly smaller than that of a linear classifier. Therefore, we need to scale up the logits using a small temperature. We followed the hyperparameter setting of previous long-tailed recognition literature (https://github.com/FlamieZhu/Balanced-Contrastive-Learning/blob/main/models/resnext.py).
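To illustrate the scaling argument, below is a minimal pure-Python sketch of a cosine similarity classifier with a temperature. It is not taken from either repository; the function names and the temperature value 0.05 are assumptions chosen only for the example.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def cosine_logits(feature, class_weights, temperature):
    """Cosine similarity between the feature and each class weight, divided by T."""
    f = l2_normalize(feature)
    return [
        sum(a * b for a, b in zip(f, l2_normalize(w))) / temperature
        for w in class_weights
    ]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


feature = [3.0, 4.0]
class_weights = [[1.0, 0.0], [0.0, 1.0]]

# Without scaling, the logits are plain cosines, bounded in [-1, 1].
raw = cosine_logits(feature, class_weights, temperature=1.0)
# A small temperature (0.05 here is a hypothetical value) stretches the logits.
scaled = cosine_logits(feature, class_weights, temperature=0.05)
```

Because cosines are bounded, `softmax(raw)` stays close to uniform, while `softmax(scaled)` can become confident, which is why the logits are scaled up.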
Thanks for the awesome work! I have several questions related to the code:
Thanks a lot!